THE MÖBIUS FUNCTION IS STRONGLY ORTHOGONAL TO NILSEQUENCES

BEN GREEN AND TERENCE TAO



Abstract. We show that the Möbius function $\mu(n)$ is strongly asymptotically orthogonal
to any polynomial nilsequence $(F(g(n)\Gamma))_{n \in \mathbb{N}}$. Here, $G$ is a
simply-connected nilpotent Lie group with a discrete and cocompact subgroup $\Gamma$ (so
$G/\Gamma$ is a nilmanifold), $g : \mathbb{Z} \to G$ is a polynomial sequence and
$F : G/\Gamma \to \mathbb{R}$ is a Lipschitz function. More precisely, we show that
$|\frac{1}{N}\sum_{n=1}^{N} \mu(n)F(g(n)\Gamma)| \ll_{F,G,\Gamma,A} \log^{-A} N$ for all $A > 0$.
In particular, this implies the Möbius and Nilsequence conjecture MN(s) from our earlier
paper [8] for every positive integer s. This is one of two major ingredients in our
programme in [8] to establish a large number of cases of the generalised Hardy-Littlewood
conjecture, which predicts how often a collection $\psi_1, \dots, \psi_t : \mathbb{Z}^d \to \mathbb{Z}$
of linear forms all take prime values. The proof is a relatively quick application of the
results in our recent companion paper [9].

We give some applications of our main theorem. We show, for example, that the
Möbius function is uncorrelated with any bracket polynomial such as
$n\sqrt{3}\lfloor n\sqrt{2}\rfloor$. We also obtain a result about the distribution of
nilsequences $(a^n x\Gamma)_{n \in \mathbb{N}}$ as $n$ ranges only over the primes.



1. Introduction

Important remark. This paper is intimately tied to, and is intended to be read
in conjunction with, the longer companion paper [9], which proves results about the
distribution of finite polynomial orbits on nilmanifolds. In particular, we shall make
heavy use of the notation and lemmas from that paper.

The aim of this paper is to establish what the authors have been referring to as the
Möbius and Nilsequence conjecture MN(s), first stated as [8, Conjecture 8.5]. Roughly
speaking, this states that the Möbius function $\mu(n)$, defined as $(-1)^k$ when $n$ is the
product of $k$ distinct primes, and $0$ otherwise, is asymptotically strongly orthogonal to
any Lipschitz $s$-step nilsequence $(F(a^n x))_{n \in \mathbb{Z}}$, in the sense that the
inner product

$$\mathbb{E}_{n \in [N]} \mu(n) F(a^n x)$$

of these two functions on $[N] := \{1, \dots, N\}$ decays to zero faster than any fixed
power of $1/\log N$. Here and in the sequel we use the averaging notation
$\mathbb{E}_{x \in X} f(x) := \frac{1}{|X|}\sum_{x \in X} f(x)$ for any finite set $X$.
Recall also that a Lipschitz $s$-step nilsequence is any sequence of the form $F(a^n x)$,
where $a$ is an element of an $s$-step connected and simply connected nilpotent Lie group
$G$, $x$ is an element of the nilmanifold $G/\Gamma$ for some discrete cocompact subgroup
$\Gamma \le G$ of $G$, and $F : G/\Gamma \to \mathbb{R}$ is a Lipschitz function.

The difficulty of this conjecture increases with s. The case s = 0 of this conjecture is
the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \ll_A \log^{-A} N.$$

The stronger estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \ll e^{-c\sqrt{\log N}}$$

is essentially equivalent to the prime number theorem (with classical error term).
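
To make the quantity in the case s = 0 concrete, the following Python sketch computes the average of the Möbius function over [N] with a simple sieve. The names `mobius_sieve` and `mobius_average` are ours and purely illustrative; a finite computation of this kind says nothing about the rate of decay, it only shows what is being averaged.

```python
def mobius_sieve(N):
    """Return a list mu with mu[n] = Möbius function of n for 0 <= n <= N (mu[0] = 0)."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            # Each prime divisor flips the sign once.
            for m in range(p, N + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # Multiples of p^2 are not squarefree.
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0
    return mu


def mobius_average(N):
    """The average E_{n in [N]} mu(n)."""
    mu = mobius_sieve(N)
    return sum(mu[1:]) / N


if __name__ == "__main__":
    for N in (10, 1000, 100000):
        print(N, mobius_average(N))
```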

The case s = 1 may be reduced by Fourier analysis to the estimate

$$|\mathbb{E}_{n \in [N]} \mu(n) e(\alpha n)| \ll_A \log^{-A} N \tag{1.1}$$

where $e(x) := e^{2\pi i x}$, required to hold uniformly for all $\alpha \in \mathbb{R}$.
This was established by Davenport [3] in the 1930s by modifying Vinogradov's method of
bilinear forms (or "Type I and Type II sums").
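
The left-hand side of (1.1) is easy to evaluate numerically for small N. The sketch below is only an illustration under our own naming (`mobius`, `twisted_average` are hypothetical helpers, and the Möbius function is computed by naive trial division); it does not reflect how the estimate is proved.

```python
import cmath


def mobius(n):
    """Möbius function of n >= 1 by trial division (adequate for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def twisted_average(N, alpha):
    """|E_{n in [N]} mu(n) e(alpha n)| with e(x) = exp(2 pi i x)."""
    total = sum(mobius(n) * cmath.exp(2j * cmath.pi * alpha * n)
                for n in range(1, N + 1))
    return abs(total) / N


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 2 ** 0.5):
        print(alpha, twisted_average(5000, alpha))
```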

In the case s = 2 the conjecture was established by the authors in [7]. For a more 
complete discussion of the conjecture and the reasons for being interested in it (and 
in particular, its applications to the generalised Hardy-Littlewood conjecture on the 
number of solutions to systems of linear equations in which the unknowns are all prime) 
the reader may refer to the introduction of [7], the first several sections of [8], or any of
the expository articles [H El HU \W\ . 

In this paper we settle the Möbius and Nilsequence conjecture. In fact, we shall
prove the marginally stronger result that the Möbius function is asymptotically strongly
orthogonal to any polynomial nilsequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$.

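Theorem 1.1 (Main Theorem). Let $G/\Gamma$ be a nilmanifold of some dimension $m \ge 1$,
let $G_\bullet$ be a filtration¹ of $G$ of some degree $d \ge 1$, and let
$g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be a polynomial sequence². Suppose that
$G/\Gamma$ has a $Q$-rational Mal'cev basis³ $\mathcal{X}$ for some $Q \ge 2$, defining
a metric $d_{\mathcal{X}}$ on $G/\Gamma$. Suppose that $F : G/\Gamma \to [-1, 1]$ is a
Lipschitz function. Then we have the bound

$$|\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \log^{-A} N$$

for any $A > 0$ and $N \ge 2$. The implied constant is ineffective.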

Remarks. By specialising to the linear case $g(n) := a^n x$ for some $a, x \in G$ (and using
the existence of $Q$-rational Mal'cev bases, see [9, Proposition A.9]), Theorem 1.1
immediately implies the Möbius and Nilsequence conjecture [8, Conjecture 8.5]. In fact it
gives a somewhat more precise result, since the dependence on $Q$ and $\|F\|_{\mathrm{Lip}}$
is given quite explicitly. For the application of Theorem 1.1 in [8], however, knowledge of
these dependencies is not necessary.

The ineffectivity of the bound in Theorem 1.1 already occurs for sufficiently large $A$ in
the 1-step case (which, as mentioned before, is essentially (1.1)), and is ultimately due
to the well-known ineffective bounds on Siegel zeroes. On the other hand, the remainder
of the argument is effective, and so any effective bound for Siegel's theorem would imply
effective bounds for Theorem 1.1. In particular, this would be the case if one assumed
GRH. In fact, in that case it is not difficult to see from modifying the arguments below
that we can replace the logarithmic decay $\log^{-A} N$ by polynomial decay $N^{-c}$ for some
$c > 0$ depending only on $d$ and $m$.

¹ In other words, $G_\bullet = (G_i)_{i=0}^{\infty}$ where
$G \supseteq G_0 \supseteq G_1 \supseteq \dots \supseteq G_d$ is a descending sequence of Lie
groups and $[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \ge 0$, with the convention that $G_i$
is trivial for $i > d$; see [9, Definition 1.2].

² A sequence $g : \mathbb{Z} \to G$ lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ if
$\partial_{h_i} \cdots \partial_{h_1} g$ takes values in $G_i$ for all
$h_1, \dots, h_i \in \mathbb{Z}$ and $i > 0$, where $\partial_h g(n) := g(n+h)g(n)^{-1}$;
see [9, Definition 1.11] and the ensuing discussion.

³ The notion of a $Q$-rational Mal'cev basis is defined in [9, Definition 2.6] and the
construction of the metric $d_{\mathcal{X}}$ is given in the same section.

The authors learnt in [9] that it is in many ways more natural to consider the class
of polynomial sequences $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ rather than simply the class
of linear sequences $n \mapsto a^n x$. This is ultimately due to the stability of the
polynomial class under a wide variety of operations, such as pointwise multiplication. On
the other hand, these two categories are certainly closely related (and are, in some sense,
equivalent): see [12] for further discussion.

Acknowledgements. The first author is partly supported by a Leverhulme Prize. 
The second author is supported by a grant from the Macarthur Foundation and by NSF 
grant DMS-0649473. 

2. Reducing to the equidistributed case 

To prove Theorem 1.1 we will apply [9, Theorem 1.19] to decompose $g$ as a product
$\varepsilon g' \gamma$ where $\varepsilon$ is "smooth", $\gamma$ is "rational" and $g'$ is
highly equidistributed in some closed subgroup $G' \subseteq G$. We will recall the precise
statement shortly.

In this section we shall show how the rather harmless factors $\varepsilon$ and $\gamma$ in
the above factorisation may be eliminated, and then make an additional reduction to the case
$\int_{G/\Gamma} F = 0$ (using the Haar measure on $G/\Gamma$, of course). This leaves us with
the task of proving an "equidistributed" case of Theorem 1.1; see Proposition 2.1 below.

For the rest of the paper, all constants $c, C$, including those in the asymptotic notation
$\ll$ and $O()$, are allowed to depend on $m$ and $d$. Different occurrences of the letters
$c, C$ may represent different constants; typically we will have $0 < c \le 1 \le C < \infty$.
For ease of notation we drop the subscript whenever Lipschitz norms are mentioned, so
$\|F\|_{\mathrm{Lip}}$ becomes simply $\|F\|$.

Recall from [9, Definition 1.3(v)] that a sequence $(g(n)\Gamma)_{n \in [N]}$ in a
nilmanifold is totally $\delta$-equidistributed if we have

$$|\mathbb{E}_{n \in P} F(g(n)\Gamma)| \le \delta \|F\| \tag{2.1}$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$ with $\int_{G/\Gamma} F = 0$ and all
arithmetic progressions $P \subseteq [N]$ of length at least $\delta N$.

In the next section we shall establish the following result about the lack of correlation 
of Mobius with equidistributed nilsequences. 

Proposition 2.1 (Möbius is orthogonal to equidistributed sequences). Let $m \ge 0$,
$d \ge 1$ be integers and let $N \ge 1$ be an integer parameter which is sufficiently large
depending on $m$ and $d$. Let $\delta$, $0 < \delta < 1/2$, and $Q \ge 2$ be real parameters.
Let $G/\Gamma$ be an $m$-dimensional nilmanifold, and suppose that $G_\bullet$ is a filtration
of degree $d$. Suppose that $G/\Gamma$ has a $Q$-rational Mal'cev basis $\mathcal{X}$ adapted
to the filtration $G_\bullet$. Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and suppose
that $(g(n)\Gamma)_{n \in [N]}$ is totally $\delta$-equidistributed. Then for any Lipschitz
function $F : G/\Gamma \to \mathbb{R}$ with $\int_{G/\Gamma} F = 0$ and for any arithmetic
progression $P \subseteq [N]$ of size at least $N/Q$, we have the bound

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_P(n) F(g(n)\Gamma)| \ll \delta^{c} Q \|F\| \log N.$$

The proof of Proposition 2.1 proceeds via the method of Type I/II sums, which is
also known as the method of bilinear forms. This is the same method that one might
use to tackle the "minor arcs" case of (1.1), where $\alpha$ is not close to a rational with
small denominator. We will describe it in detail in the next section. Our task for the
remainder of this section is to reduce Theorem 1.1 to Proposition 2.1.

Proof that Proposition 2.1 implies Theorem 1.1. We start with a brief overview. The
main ingredient of this argument is [9, Theorem 1.19], that is to say the factorization
$g = \varepsilon g' \gamma$ mentioned above. In addition to that we require estimates for
sums of the type $\mathbb{E}_{n \in [N]} \mu(n) 1_P(n)$, where $P \subseteq [N]$ is a
progression. After standard harmonic analysis, such bounds ultimately depend on results
about the zeros of $L$-functions $L(s, \chi)$, and as such this is analysis of the same type
as would be used to establish the "major arc" cases of (1.1). Finally, a fair amount of what
might be called "quantitative nil-linear algebra" is required to keep track of the various
nilmanifolds and Lipschitz functions involved in the argument. Here we draw repeatedly on the
material assembled in [9, Appendix A] for this purpose; we encourage the reader to gloss over
these essentially routine issues on a first reading.

We now turn to the details. We allow all implied constants to depend on m and d. 

Let the hypotheses be as in Theorem 1.1. To simplify the notation slightly we will
also assume that $\|F\| \ge 1$; the case $\|F\| < 1$ can easily be deduced from that case. By
dividing out by $\|F\|$ we may in fact normalize and assume that $\|F\| = 1$.

We may of course take $A \ge 1$. We may also assume that $Q \le \log N$, since the claim
is vacuously true otherwise; thus $\mathcal{X}$ is now a $\log N$-rational Mal'cev basis. By
increasing $A$ if necessary, it will suffice to show an estimate of the form

$$|\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma)| \ll_A \log^{-A + O(1)} N. \tag{2.2}$$

Let $B$ be a parameter (depending on $A$) to be specified later. We may assume that $N$
is sufficiently large depending on $A, B$. By [9, Theorem 1.19] (with $M_0 := \log N$) we
can find an integer $M$,

$$\log N \le M \le \log^{O_B(1)} N,$$

a rational subgroup $G' \subseteq G$, a Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ (where
$\Gamma' := G' \cap \Gamma$) in which each element is an $M$-rational combination (see
[9, Definition 1.21]) of the elements of $\mathcal{X}$, and a decomposition

$$g = \varepsilon g' \gamma \tag{2.3}$$

into polynomial sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$
with the following properties:

(i) $\varepsilon : \mathbb{Z} \to G$ is $(M, N)$-smooth (see [9, Definition 1.22] for a definition);
(ii) $g' : \mathbb{Z} \to G'$ takes values in $G'$, and the finite sequence
     $(g'(n)\Gamma')_{n \in [N]}$ is totally $M^{-B}$-equidistributed in $G'/\Gamma'$, using
     the metric $d_{\mathcal{X}'}$ on $G'/\Gamma'$;
(iii) $\gamma : \mathbb{Z} \to G$ is $M$-rational (see [9, Definition 1.21]), and
     $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with period $1 \le q \le M$.

From (2.3) we have

$$\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma) = \mathbb{E}_{n \in [N]} \mu(n) F(\varepsilon(n) g'(n) \gamma(n)\Gamma). \tag{2.4}$$

The sequence $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with some period $q$,
$1 \le q \le M$. For each $j = 0, 1, \dots, q - 1$ let $\gamma_j := \{\gamma(j)\}$ be the
fractional part of $\gamma(j)$ with respect to $\Gamma$, thus $\gamma_j \Gamma = \gamma(j)\Gamma$
and all the coordinates $\psi_{\mathcal{X}}(\gamma_j)$ lie in $[0, 1)$. This construction is
described in [9, Lemma A.14].

Now by [9, Lemma A.12], the coordinates $\psi_{\mathcal{X}}(\gamma(j))$ lie in
$\frac{1}{M'}\mathbb{Z}^m$ for some $M' \ll M^{O(1)}$. Since $\gamma_j = \gamma(j)\eta$ for some
$\eta$ with integer coordinates, it follows from [9, Lemma A.3] that the coordinates
$\psi_{\mathcal{X}}(\gamma_j)$ are rationals with height $\ll M^{O(1)}$.

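We now take advantage of the periodicity of $\gamma(n)\Gamma$ to split the right-hand side of
(2.4) as

$$\sum_{j=0}^{q-1} \mathbb{E}_{n \in [N]} \mu(n) 1_{n \equiv j \ (\mathrm{mod}\ q)} F(\varepsilon(n) g'(n) \gamma_j \Gamma). \tag{2.5}$$

By the right-invariance of $d_{\mathcal{X}}$, the $(M, N)$-smoothness of $\varepsilon$ (see
[9, Definition 1.21]) and the 1-Lipschitz bound on $F$ we see that

$$|F(\varepsilon(n) g'(n) \gamma_j \Gamma) - F(\varepsilon(n_0) g'(n) \gamma_j \Gamma)| \le d_{\mathcal{X}}(\varepsilon(n) g'(n) \gamma_j, \varepsilon(n_0) g'(n) \gamma_j) = d_{\mathcal{X}}(\varepsilon(n_0), \varepsilon(n)) \le \log^{-A} N$$

whenever $|n - n_0| \le \frac{N}{M \log^A N}$. Hence if we split each progression
$n \equiv j \ (\mathrm{mod}\ q)$ into further progressions $P_{j,k}$ with $k = O(M \log^A N)$,
each having diameter at most $\frac{N}{M \log^A N}$, we see that (2.5) is equal to

$$\sum_{j=0}^{q-1} \sum_{k} \mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F(a_{j,k} g'(n) \gamma_j \Gamma) + O(\log^{-A} N). \tag{2.6}$$

Here each $a_{j,k} := \varepsilon(n_{0,j,k})$ for some $n_{0,j,k} \in P_{j,k}$; by the
definition of what it means for $\varepsilon : \mathbb{Z} \to G$ to be $(M, N)$-smooth (i.e.
[9, Definition 1.21]), it follows that $d_{\mathcal{X}}(a_{j,k}, \mathrm{id}_G) \le M$ and
hence, by [9, Lemma A.4], that

$$|\psi_{\mathcal{X}}(a_{j,k})| \ll M^{O(1)}. \tag{2.7}$$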

If $N$ is sufficiently large depending on $A$ and $B$ then $N \ge 10 M \log^A N$ (say), and
this partition of $[N]$ may be arranged in such a way that

$$|P_{j,k}| \ge \frac{N}{2qM \log^A N} \ge \frac{N}{2M^2 \log^A N}.$$

Since the number of $j$ is at most $M$, and the number of $k$ is $O(M \log^A N)$, we
thus see that to show (2.2) it suffices by the triangle inequality to show that

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F(a_{j,k} g'(n) \gamma_j \Gamma)| \ll_A M^{-2} \log^{-2A + O(1)} N \tag{2.8}$$

for each $j, k$.

Fix $j, k$. Write $H_j := \gamma_j^{-1} G' \gamma_j$ and let $g_j : \mathbb{Z} \to H_j$ be the
sequence defined by $g_j(n) := \gamma_j^{-1} g'(n) \gamma_j$. It is clear that each $g_j$ is a
polynomial sequence with coefficients in the filtration
$(H_j)_\bullet := \gamma_j^{-1} (G')_\bullet \gamma_j$.

Set $\Lambda_j := \Gamma \cap H_j$ and define functions

$$F_{j,k} : H_j/\Lambda_j \to [-1, 1]$$

by the formula

$$F_{j,k}(x\Lambda_j) := F(a_{j,k} \gamma_j x \Gamma).$$

Then (2.8) can be rewritten as

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F_{j,k}(g_j(n)\Lambda_j)| \ll_A M^{-2} \log^{-2A + O(1)} N. \tag{2.9}$$

Suppose for the moment that $F_{j,k}$ were a constant function. Recall that $P_{j,k}$ has
common difference $q \le M$. We may thus apply Proposition A.2 (with $A$ replaced by a
sufficiently large exponent $A'$ depending on $A$ and $B$) to obtain the desired claim, since
$M \ll \log^{O_B(1)} N$. Therefore we may subtract off the mean of $F_{j,k}$ and assume without
loss of generality that $\int_{H_j/\Lambda_j} F_{j,k} = 0$. This may cause $F_{j,k}$ to take
values in $[-2, 2]$ rather than $[-1, 1]$, but we can easily counter this trivial issue by
dividing $F_{j,k}$ by two.

In a moment we shall use Proposition 2.1 to estimate the terms appearing here. Before
doing that we record quantitative rationality properties of the nilmanifold $H_j/\Lambda_j$,
as well as a Lipschitz bound on $\|F_{j,k}\|$.

Claim. There is a Mal'cev basis $\mathcal{Y}_j$ for $H_j/\Lambda_j$ adapted to the filtration
$(H_j)_\bullet$ such that each element of $\mathcal{Y}_j$ is an $M^{O(1)}$-rational
combination of the elements of $\mathcal{X}$. With respect to the metric $d_{\mathcal{Y}_j}$ on
$H_j/\Lambda_j$ induced by this basis, the polynomial sequence
$g_j \in \mathrm{poly}(\mathbb{Z}, (H_j)_\bullet)$ is totally $M^{-cB + O(1)}$-equidistributed
for some $c > 0$ depending only on $m, d$, and we have

$$\|F_{j,k}\| \ll M^{O(1)}.$$

Proof. We shall apply suitable combinations of the lemmas in [9, Appendix A].
The existence of $\mathcal{Y}_j$ follows from Proposition A.9 and Lemma A.13 of [9] together
with the fact that each $\gamma_j$ has rational coordinates with height $\ll M^{O(1)}$. Now the
map $x \mapsto F(a_{j,k} \gamma_j x \Gamma)$ on $G/\Gamma$ has Lipschitz constant at most
$M^{O(1)}$ by [9, Lemma A.5] and the bounds
$|\psi_{\mathcal{X}}(a_{j,k})|, |\psi_{\mathcal{X}}(\gamma_j)| \le M^{O(1)}$. The final
statement of the claim, and the statement about the quantitative equidistribution of $g_j$,
now follow from [9, Lemma A.17]. □

Let us now apply Proposition 2.1 to (2.9). We apply the proposition with parameters
(which we distinguish using tildes) as follows: $\tilde G := H_j$, $\tilde\Gamma := \Lambda_j$,
$\tilde G_\bullet := (H_j)_\bullet$, $\tilde g := g_j$, $\tilde{\mathcal{X}} := \mathcal{Y}_j$,
$\tilde Q := M^{O(1)}$, $\tilde F := F_{j,k}$ and $\tilde\delta := M^{-cB + O(1)}$. We quickly
see that the left-hand side of (2.9) is bounded by $O(M^{-cB + O(1)} \log^{O(1)} N)$. Choosing
$B$ sufficiently large depending on $A$, we obtain (2.9) as claimed. □

3. The equidistributed case: Type I and II sums 

In this section we establish Proposition 2.1 using Vinogradov's method of Type I and
II sums in the form due to Vaughan [16]. More precisely, we will use the following
proposition.

Proposition 3.1 (Method of Type I/II sums). Let $f : \mathbb{N} \to \mathbb{C}$ be a function
with $\|f\|_\infty \le 1$ such that

$$|\mathbb{E}_{N < n \le 2N} \mu(n) f(n)| \ge \varepsilon$$

for some $\varepsilon > 0$. Then one of the following statements holds:

• (Type I sum is large) There exists an integer $1 \le K \le N^{2/3}$ such that

$$|\mathbb{E}_{N/k < w \le 2N/k} f(kw)| \gg (\varepsilon/\log N)^{O(1)} \tag{3.1}$$

for $\gg (\varepsilon/\log N)^{O(1)} K$ integers $k$ such that $K < k \le 2K$.

• (Type II sum is large) There exist integers $K, W$ with
$\frac{1}{2} N^{1/3} \le K \le 4N^{2/3}$ and $N/4 \le KW \le 4N$, such that

$$|\mathbb{E}_{K < k, k' \le 2K} \mathbb{E}_{W < w, w' \le 2W} f(kw) \overline{f(k'w)}\, \overline{f(kw')} f(k'w')| \gg (\varepsilon/\log N)^{O(1)}. \tag{3.2}$$



Proof. This is ^ Proposition 4.2], specialised to the case U = V = N^^^, and with 
certain explicit exponents replaced by unspecified constants 0(1). D 



We now begin the proof of Proposition 2.1. As before we may normalise so that
$\|F\| = 1$. From this and the mean zero assumption, we see in particular that

$$|F(x)| \le \mathrm{diam}(G/\Gamma) \ll Q^{O(1)} \tag{3.3}$$

for all $x \in G/\Gamma$ (the diameter bound here is [9, Lemma A.16]).

If $\delta \le 1/N$ then by (2.1) we have $|F(g(n)\Gamma)| \le \delta$ for all $n \in [N]$, and
the claim is trivial, so we may assume that $\delta > 1/N$. By increasing $\delta$ if necessary
(and shrinking $c$) we thus see that we may assume that

$$\delta > N^{-\sigma} \tag{3.4}$$

for any fixed small constant $\sigma > 0$ depending only on $m, d$.

The basic idea, which will become clearer upon reading the details, is to make good use
of the fact that one may test the quantitative equidistribution properties of a polynomial
nilsequence on $G/\Gamma$ by passing to the abelianisation $(G/\Gamma)_{\mathrm{ab}}$, a
phenomenon referred to in [9, Theorem 2.9] as the "quantitative Leibman Dichotomy" (cf. [12]).
The abelian issues that one must then deal with are of a very similar nature to those involved
in dealing with exponential sums such as $\mathbb{E}_{n \in [N]} \mu(n) e(p(n))$, where
$p : \mathbb{R} \to \mathbb{R}/\mathbb{Z}$ is an ordinary polynomial. Rather than quote results
from the existing literature on this problem it is easier for us to invoke various lemmas from
[9], which were stated and proved in a language which is helpful for the present paper.

Let $\varepsilon := \delta^{c_1} Q \log N$, for a constant $c_1$ to be specified later. We may
assume that $\varepsilon < 1$, otherwise the claim is trivial from (3.3) and the triangle
inequality. In particular, we have

$$Q, \log N \le \delta^{-c_1}$$

and we will use these estimates frequently in the sequel to absorb any polynomial factors
in $Q$ or $\log N$ into a power of $\delta^{-c_1}$.

Suppose for contradiction that Proposition 2.1 failed for these parameters. We then
apply Proposition 3.1 with $f(n) := 1_P(n) F(g(n)\Gamma)$ and $\varepsilon$ as above,
concluding that either (3.1) or (3.2) holds. We deal with these two cases in turn.

The Type I case. Suppose that (3.1) holds. Thus there are $\gg \delta^{O(c_1)} K$ values of
$k \in (K, 2K]$ such that

$$|\mathbb{E}_{N/k < w \le 2N/k} 1_P(kw) F(g(kw)\Gamma)| \gg \delta^{O(c_1)}.$$

Let $l$ denote the common difference of $P$; since $|P| \ge N/Q$, we must have
$1 \le l \le Q$. Splitting into progressions with common difference $l$, we see that for some
$b \ (\mathrm{mod}\ l)$ and for $\gg \delta^{O(c_1)} K$ values of $k \in (K, 2K]$ we have

$$\Big| \sum_{\substack{N/k < w \le 2N/k \\ w \equiv b \ (\mathrm{mod}\ l)}} 1_P(kw) F(g(kw)\Gamma) \Big| \gg \delta^{O(c_1)} \frac{N}{k}.$$

Setting $w = b + lw'$, this may be rewritten as

$$\Big| \sum_{w' \in I_k} F(g(k(b + lw'))\Gamma) \Big| \gg \delta^{O(c_1)} \frac{N}{k}, \tag{3.5}$$

where $I_k \subseteq [\frac{N}{kl} - 1, \frac{2N}{kl}]$ is an interval.

For each value of $k$ for which this holds, consider the sequence $g_k : \mathbb{Z} \to G$
defined by $g_k(n) := g(kn)$ and also the sequence $\tilde g_k : \mathbb{Z} \to G$ defined by
$\tilde g_k(n) = g(k(b + ln))$. It follows from [9, Corollary 6.8] that
$g_k, \tilde g_k \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Now (3.5) implies that
$(\tilde g_k(n)\Gamma)_{n \in [N_k]}$ fails to be $\delta^{O(c_1)}$-equidistributed in
$G/\Gamma$, where $N_k \sim N/kl$.

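It follows from [9, Theorem 2.9] that there is a nontrivial horizontal character
$\psi_k : G \to \mathbb{R}/\mathbb{Z}$ (i.e. a continuous homomorphism from $G$ to
$\mathbb{R}/\mathbb{Z}$ which annihilates $\Gamma$) with magnitude
$|\psi_k| \ll \delta^{-O(c_1)}$ such that

$$\|\psi_k \circ \tilde g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)}.$$

Recall from [9, Definition 2.10] that the $C^\infty[N]$-norm of a polynomial
$p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ expanded in binomial coefficients as

$$p(n) = \alpha_0 + \alpha_1 \binom{n}{1} + \dots + \alpha_d \binom{n}{d} \tag{3.6}$$

is defined by

$$\|p\|_{C^\infty[N]} := \sup_{1 \le j \le d} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}}.$$

By [9, Lemma 8.4] (specialised to the single-parameter case $t = 1$), there is some
$q_k \ll \delta^{-O(c_1)}$ such that

$$\|q_k \psi_k \circ g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)}.$$

Pigeonholing in the possible choices of $q_k \psi_k$, we may find some $\psi$ with
$0 < |\psi| \ll \delta^{-O(c_1)}$ such that

$$\|\psi \circ g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)} \tag{3.7}$$

for $\gg \delta^{O(c_1)} K$ values of $k \in (K, 2K]$.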

Write

$$\psi \circ g(n) = \beta_d n^d + \dots + \beta_0. \tag{3.8}$$

Then

$$\psi \circ g_k(n) = \beta_d k^d n^d + \dots + \beta_0. \tag{3.9}$$

We would like to use this and (3.7) to conclude that the coefficients $k^j \beta_j$ are close
to being integer (or rational with small denominator). This will follow from a simple
lemma.

Lemma 3.2. Suppose that $p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a polynomial of the form
$p(n) = \beta_d n^d + \dots + \beta_0$. Then there is some $q \ge 1$, $q = O(1)$, such that
$\|q\beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll N^{-j} \|p\|_{C^\infty[N]}$ for $j = 1, \dots, d$.

Proof. Consider the representation (3.6) which is used to define the $C^\infty[N]$-norm.
Observing that $\beta_j$ can be written as a linear combination of $\alpha_j, \dots, \alpha_d$
with rational coefficients of height $O(1)$, the result follows upon clearing denominators. □
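
The change of basis used in this proof can be carried out explicitly. The following Python sketch (with hypothetical helper names `binomial_basis_poly` and `monomial_coefficients`, and with real-valued rather than R/Z-valued coefficients for simplicity) expands each binomial coefficient in powers of n using exact rational arithmetic. It illustrates that the coefficient of n^j only involves the binomial coefficients of index j, ..., d, and that multiplying by d! clears all denominators.

```python
from fractions import Fraction
from math import factorial


def binomial_basis_poly(i):
    """Coefficients (constant term first) of binom(n, i) as a polynomial in n."""
    coeffs = [Fraction(1)]
    for r in range(i):
        # Multiply the current polynomial by (n - r).
        new = [Fraction(0)] * (len(coeffs) + 1)
        for deg, c in enumerate(coeffs):
            new[deg + 1] += c
            new[deg] -= r * c
        coeffs = new
    return [c / factorial(i) for c in coeffs]


def monomial_coefficients(alphas):
    """Given alpha_0..alpha_d in the binomial basis, return beta_0..beta_d."""
    d = len(alphas) - 1
    betas = [Fraction(0)] * (d + 1)
    for i, alpha in enumerate(alphas):
        for j, c in enumerate(binomial_basis_poly(i)):
            betas[j] += Fraction(alpha) * c
    return betas


if __name__ == "__main__":
    # binom(n, 3) written in powers of n
    print(monomial_coefficients([0, 0, 0, 1]))
```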

From (3.7), (3.9) and Lemma 3.2 we see that there is some $q \ge 1$, $q = O(1)$, such
that

$$\|q k^j \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (N/K)^{-j} \tag{3.10}$$

for $j = 1, 2, \dots, d$ and for at least $\delta^{O(c_1)} K$ values of $k \in (K, 2K]$.

Fix $j$, $1 \le j \le d$. To pass from the $j$-th powers $k^j$ to more general integers we shall
need the following Waring-type result.

Lemma 3.3. Let $K \ge 1$ be an integer, and suppose that $S \subseteq [K]$ is a set of size
$\alpha K$. Suppose that $t \ge 2^j + 1$. Then $\gg_{j,t} \alpha^{2t} K^j$ integers in the
interval $[tK^j]$ can be written in the form $k_1^j + \dots + k_t^j$, $k_1, \dots, k_t \in S$.

Proof. It is a well-known consequence of Hardy and Littlewood's asymptotic formula
for Waring's problem (see e.g. [17]) that the number of solutions to

$$x_1^j + \dots + x_t^j = M, \qquad x_1, \dots, x_t \in [K]$$

is $\ll_{j,t} K^{t-j}$ uniformly in $M$ provided that $t \ge 2^j + 1$. (In fact, by subsequent
work, such a result is known for much smaller values of $t$ when $j$ is large.) Let
$X = \{k^j : k \in S\}$ and let $r(n)$ be the number of representations of $n$ as the sum of
$t$ elements of $X$. Then by the Cauchy-Schwarz inequality and the preceding remarks we have

$$\alpha^{2t} K^{2t} = \Big( \sum_n r(n) \Big)^2 \le |tX| \sum_n r(n)^2 \ll_{j,t} |tX| K^{2t - j},$$

which implies the result. □
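
The counting argument in this proof can be checked by brute force on small examples. The sketch below is an illustration only: the helper name `representation_counts` is ours, and the parameters used in the demonstration are far too small to say anything about the asymptotic statement. It tabulates r(n) for ordered t-tuples from S and verifies the Cauchy-Schwarz step relating the total count, the number of representable integers |tX| and the second moment of r.

```python
from collections import Counter
from itertools import product


def representation_counts(S, j, t):
    """r(n) = number of ordered t-tuples (k_1, ..., k_t) in S^t with sum of j-th powers n."""
    counts = Counter()
    for ks in product(S, repeat=t):
        counts[sum(k ** j for k in ks)] += 1
    return counts


def cauchy_schwarz_holds(counts):
    """Check (sum_n r(n))^2 <= |tX| * sum_n r(n)^2."""
    total = sum(counts.values())
    second_moment = sum(v * v for v in counts.values())
    return total * total <= len(counts) * second_moment


if __name__ == "__main__":
    S = [k for k in range(1, 13) if k % 2 == 0]  # a subset of [12] of density 1/2
    r = representation_counts(S, j=2, t=5)
    print("representable integers:", len(r))
    print("Cauchy-Schwarz step holds:", cauchy_schwarz_holds(r))
```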

By (3.10) and Lemma 3.3 it follows that

$$\|q l \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (K/N)^j$$

for $\gg \delta^{O(c_1)} K^j$ values of $l \in [10^j K^j]$.

The following lemma, which is [9, Lemma 3.2], may be applied to this situation.

Lemma 3.4 (Strongly recurrent linear functions are highly non-diophantine). Let
$\alpha \in \mathbb{R}$, $0 < \sigma < 1/2$, and $0 < \mu \le \sigma/2$, and let
$I \subseteq \mathbb{R}/\mathbb{Z}$ be an interval of length $\mu$ such that $\alpha n \in I$
for at least $\sigma N$ values of $n \in [N]$. Then there is some $k \in \mathbb{Z}$ with
$0 < |k| \ll \sigma^{-O(1)}$ such that
$\|k\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \mu \sigma^{-O(1)}/N$. □



Let us attempt to apply this lemma with $\sigma \gg \delta^{O(c_1)}$ and
$\mu \ll \delta^{-O(c_1)} (K/N)^j$. If $N$ is sufficiently large and the exponent $\sigma$ in
(3.4) is sufficiently small, we see using the bound $K/N \le N^{-1/3}$ that the hypotheses of
the lemma are satisfied and that such an application is permissible. The conclusion is that
there is some $q'$, $1 \le q' \ll \delta^{-O(c_1)}$, such that

$$\|q q' \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} N^{-j}. \tag{3.11}$$

Writing $\tilde\psi := q q' \psi$, it follows from (3.8) and (3.11) that for any $n$ we have the
bound

$$\|\tilde\psi \circ g(n)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} n/N.$$

If $N' := \delta^{C c_1} N$ for some sufficiently large $C$, and if $n \in [N']$, this implies
that

$$\|\tilde\psi \circ g(n)\|_{\mathbb{R}/\mathbb{Z}} \le 1/10. \tag{3.12}$$

Now set $\tilde F : G/\Gamma \to [-1, 1]$ to be the function $\tilde F := \eta \circ \tilde\psi$,
where $\eta : \mathbb{R}/\mathbb{Z} \to [-1, 1]$ is a function of Lipschitz norm $O(1)$ and
mean zero which equals 1 on $[-1/10, 1/10]$. Then we have $\int_{G/\Gamma} \tilde F = 0$ and
$\|\tilde F\| \ll \delta^{-O(c_1)}$. From (3.12), we have

$$|\mathbb{E}_{n \in [N']} \tilde F(g(n)\Gamma)| \ge 1 > \delta \|\tilde F\|$$

provided that $c_1$ is chosen sufficiently small. This is contrary to the assumption that
$(g(n)\Gamma)_{n \in [N]}$ is totally $\delta$-equidistributed.

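The Type II case. This is in many ways very similar to the Type I case, as the
reader will see. Recall the situation that (3.2) puts us in (with our choice of
$\varepsilon$): there are $K, W$ with $\frac{1}{2} N^{1/3} \le K \le 4N^{2/3}$ and
$N/4 \le KW \le 4N$ such that

$$|\mathbb{E}_{K < k, k' \le 2K} \mathbb{E}_{W < w, w' \le 2W} f(kw) \overline{f(kw')}\, \overline{f(k'w)} f(k'w')| \gg \delta^{O(c_1)},$$

where $f(n) = 1_P(n) F(g(n)\Gamma)$. Writing the left-hand side here as

$$\mathbb{E}_{K < k, k' \le 2K} |\mathbb{E}_{W < w \le 2W} f(kw) \overline{f(k'w)}|^2,$$

we see that there are $\gg \delta^{O(c_1)} K^2$ pairs $(k, k') \in (K, 2K]^2$ such that

$$|\mathbb{E}_{W < w \le 2W} f(kw) \overline{f(k'w)}| \gg \delta^{O(c_1)}.$$

Written out in full, for each such pair $(k, k')$ we have

$$|\mathbb{E}_{W < w \le 2W} 1_P(kw) 1_P(k'w) F(g(kw)\Gamma) F(g(k'w)\Gamma)| \gg (\varepsilon/\log N)^{O(1)}.$$

Writing $l$ for the common difference of $P$ (thus $1 \le l \le Q$) we see that there is some
$b \ (\mathrm{mod}\ l)$ such that for $\gg (\varepsilon/\log N)^{O(1)} K^2$ pairs $(k, k')$ we
have

$$\Big| \sum_{\substack{W < w \le 2W \\ w \equiv b \ (\mathrm{mod}\ l)}} 1_P(kw) 1_P(k'w) F(g(kw)\Gamma) F(g(k'w)\Gamma) \Big| \gg \delta^{O(c_1)} W.$$

Setting $w = lw' + b$, this may be written as

$$\Big| \sum_{w' \in I_{k,k'}} F(g(k(b + lw'))\Gamma) F(g(k'(b + lw'))\Gamma) \Big| \gg \delta^{O(c_1)} W, \tag{3.13}$$

where $I_{k,k'} \subseteq [\frac{W}{l} - 1, \frac{2W}{l}]$ is an interval. Since
$1 \le l \le Q$, which is bounded by a small power of $N$, and $W \gg N^{1/3}$, this is
contained in $[\frac{W}{2l}, \frac{2W}{l}]$.

For each $k, k'$ for which this holds, consider the sequence
$g_{k,k'} : \mathbb{Z} \to G \times G$ defined by $g_{k,k'}(n) = (g(kn), g(k'n))$, and also the
sequence $\tilde g_{k,k'} : \mathbb{Z} \to G \times G$ defined by
$\tilde g_{k,k'}(n) = (g(k(b + ln)), g(k'(b + ln)))$. It follows from [9, Corollary 6.8] that
$g_{k,k'}, \tilde g_{k,k'} \in \mathrm{poly}(\mathbb{Z}, G_\bullet \times G_\bullet)$. Now from
(3.13) we see that the sequence
$(\tilde g_{k,k'}(n)(\Gamma \times \Gamma))_{n \in [N_{k,k'}]}$ fails to be
$\delta^{O(c_1)}$-equidistributed in $(G/\Gamma) \times (G/\Gamma)$, for some
$N_{k,k'} \in [\frac{W}{2l}, \frac{2W}{l}]$.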

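It follows from [9, Theorem 2.9] that there is a nontrivial horizontal character
$\psi_{k,k'} : G \times G \to \mathbb{R}/\mathbb{Z}$ with
$|\psi_{k,k'}| \ll \delta^{-O(c_1)}$ such that

$$\|\psi_{k,k'} \circ \tilde g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)}.$$

By [9, Lemma 8.4] there is some $q_{k,k'} \ll \delta^{-O(c_1)}$ such that

$$\|q_{k,k'} \psi_{k,k'} \circ g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)}.$$

Pigeonholing in the possible choices of $q_{k,k'} \psi_{k,k'}$, we may find some $\psi$ with
$0 < |\psi| \ll \delta^{-O(c_1)}$ such that

$$\|\psi \circ g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)} \tag{3.14}$$

for $\gg \delta^{O(c_1)} K^2$ pairs $k, k' \in (K, 2K]$.

Write $\psi = \psi_1 \oplus \psi_2$, where $\psi_1, \psi_2 : G \to \mathbb{R}/\mathbb{Z}$ are
horizontal characters, not both zero. If

$$\psi_1 \circ g(n) = \beta_d n^d + \dots + \beta_0$$

and

$$\psi_2 \circ g(n) = \beta'_d n^d + \dots + \beta'_0$$

then

$$\psi \circ g_{k,k'}(n) = (\beta_d k^d + \beta'_d k'^d) n^d + \dots + (\beta_0 + \beta'_0).$$

By Lemma 3.2 and (3.14) there is some $q$, $1 \le q \ll \delta^{-O(c_1)}$, such that

$$\|q (k^j \beta_j + k'^j \beta'_j)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} N_{k,k'}^{-j} \ll \delta^{-O(c_1)} (K/N)^j$$

for $j = 1, 2, \dots, d$ and for $\gg \delta^{O(c_1)} K^2$ pairs $k, k' \in (K, 2K]$.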

Suppose, without loss of generality, that $\psi_1 \ne 0$. Selecting some $k'$ that occurs in
$\gg \delta^{O(c_1)} K$ of the pairs $k, k'$ and subtracting, we see that

$$\|q k^j \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (K/N)^j \tag{3.15}$$

for $\gg \delta^{O(c_1)} K$ values of $k \in (-K, K)$. Using the bounds $K \gg N^{1/3}$ and
(3.4) it follows that we may ignore the contribution of $k = 0$, that is to say (3.15) holds
for $\gg \delta^{O(c_1)} K$ values of $k \in [1, K]$.

Remark. Note carefully that (3.15) carries no information when $k = 0$. In our
treatment of Type I sums there was no need for a lower bound on $K$, but such an
assumption is essential if one has any desire to bound Type II sums.

The estimate (3.15) is identical to (3.10). We may now repeat the arguments used to
obtain a contradiction from (3.10) in the Type I case. The proof of Proposition 2.1 and thus
Theorem 1.1 is now complete. □

The main business of the paper is now complete. In the next section we give a brief
discussion of how our argument compares with the classical Hardy-Littlewood method.
After that we give a number of applications of Theorem 1.1.



4. Remarks on a nilpotent Hardy-Littlewood method 

It may be of interest to interpret our method in terms of the "major and minor
arcs" terminology of the Hardy-Littlewood method. Recall that to prove Davenport's
estimate

$$|\mathbb{E}_{n \in [N]} \mu(n) e(\alpha n)| \ll_A \log^{-A} N$$

one divides into two cases: the major arcs where $\alpha$ is close to a rational with small
denominator, and the minor arcs where it is not. The major arcs are handled using
$L$-function technology as in Appendix A and the minor arcs are handled using Type
I/II sums as in Proposition 3.1.

Suppose that we are considering the sum

$$\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma),$$

where $\int_{G/\Gamma} F = 0$. Decompose $g$ as a product $\varepsilon g' \gamma$ where
$\varepsilon$ is smooth, $\gamma$ is rational and $g'$ is highly equidistributed on some
subgroup $G'$. Then one might think of $g$ as a "major arc" nilsequence if
$G' = \{\mathrm{id}_G\}$, and as "minor arc" if $G'$ is nontrivial.

To justify this terminology, observe that one may interpret $e(\alpha n)$ as
$F(g(n)\Gamma)$, where $G/\Gamma = \mathbb{R}/\mathbb{Z}$, $g : \mathbb{Z} \to \mathbb{R}$ is
the polynomial sequence $g(n) = \alpha n$ and the Lipschitz function $F$, taking values in the
unit ball of the complex plane, is simply $e(\theta)$.

If $\alpha = \frac{a}{q} + \varepsilon$, where $\varepsilon$ is small, then the decomposition
$g = \varepsilon g' \gamma$ will be given by $\varepsilon(n) = \varepsilon n$,
$g'(n) = \mathrm{id}_G$ and $\gamma(n) = an/q$ and so this does indeed correspond to a "major
arc nilsequence".

If $\alpha$ is not close to a rational with small denominator then $g(n)$ will already be
highly equidistributed on $\mathbb{R}/\mathbb{Z}$, and so the decomposition
$g = \varepsilon g' \gamma$ has $\varepsilon = \gamma = \mathrm{id}_G$ and $g' = g$.
Thus $G' = \mathbb{R}$ is nontrivial and this corresponds to a "minor arc nilsequence".
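
In this abelian model the splitting of a frequency into a rational part and a small error is easy to experiment with. The Python sketch below is a heuristic illustration only: the function names, the denominator bound `max_denominator` and the threshold `tolerance / N` are our own hypothetical choices and are not the quantitative parameters of the factorisation theorem. It finds the best rational approximation a/q with bounded denominator, so that the smooth part is n times the error and the rational part an/q is periodic modulo 1 with period q.

```python
from fractions import Fraction


def split_frequency(alpha, max_denominator):
    """Return (a/q, error) with alpha = a/q + error and q <= max_denominator."""
    approx = Fraction(alpha).limit_denominator(max_denominator)
    error = alpha - approx.numerator / approx.denominator
    return approx, error


def looks_major_arc(alpha, N, max_denominator, tolerance):
    """Heuristic test: is alpha within tolerance/N of a rational with small denominator?"""
    _, error = split_frequency(alpha, max_denominator)
    return abs(error) <= tolerance / N


if __name__ == "__main__":
    for alpha in (0.5 + 1e-9, 2 ** 0.5):
        approx, error = split_frequency(alpha, 10)
        print(alpha, approx, error, looks_major_arc(alpha, 1000, 10, 1.0))
```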

5. On bracket polynomials

By a bracket polynomial we mean an object formed from the scalar field $\mathbb{R}$ and the
indeterminate $n$ using finitely many instances of the standard arithmetic operations $+$,
$\times$ together with the integer part operation $\lfloor\ \rfloor$ and the fractional part operation $\{\ \}$. The
following are all bracket polynomials: n^+n\/2, n\/2\n\/?>\ and {n^ \/2+n'^ \n\/b\+\/l} . 
One may associate a notion of complexity to any bracket polynomial $p(n)$, this being
(for instance) the least number of operations $+$, $\times$, $\lfloor\ \rfloor$, $\{\ \}$
required to write down $p$. In view of the relation $\{x\} + \lfloor x \rfloor = x$, it is not
strictly speaking necessary to retain both the integer and fractional part operations, but we
do so here for convenience. Dispensing with one of them would slightly alter the definition of
complexity.

The following remarkable theorem of Bergelson and Leibman [2] demonstrates a close
link between bracket polynomials and nilmanifolds. If $G/\Gamma$ is a nilmanifold with Mal'cev
basis $\mathcal{X}$ then recall from [9, Lemma A.14] that the coordinate map
$\psi : G \to \mathbb{R}^m$ provides an identification between $G/\Gamma$ and $[0, 1)^m$.
Write $\tau_1, \dots, \tau_m$ for the individual coordinate maps from $G/\Gamma$ to $[0, 1)$,
that is to say $\tau_i$ is the composition of $\psi$ with the map
$(t_1, \dots, t_m) \mapsto t_i$.

Theorem 5.1 (Bergelson-Leibman). The functions of the form $n \mapsto \{p(n)\}$, where $p$
is a bracket polynomial, coincide with the functions of the form
$n \mapsto \tau_i(g(n)\Gamma)$, where $G/\Gamma$ is a nilmanifold equipped with a Mal'cev basis
$\mathcal{X}$ and $g : \mathbb{Z} \to G$ is a polynomial map with coefficients in some
filtration $G_\bullet$. The rationality of $\mathcal{X}$, the dimension of $G$, the degree of
$g$ and the rationality of $G_\bullet$ may all be bounded in terms of the complexity of $p$,
and conversely the complexity of $p$ may be bounded in terms of these quantities.

In fact, Bergelson and Leibman prove a number of rather refined variants of this 
type of result, and they also give a comprehensive and edifying discussion of bracket 
polynomials in general. At first glance it appears that one might immediately combine
Theorem 5.1 with Theorem 1.1 to obtain a result about the correlation of the Möbius
function with bracket polynomials. There is a serious catch, however: the coordinate
functions $\tau_i$ are not continuous on the nilmanifold $G/\Gamma$. Furthermore, as observed by
Bergelson and Leibman, there are bracket polynomials which cannot be written in the
form $F(g(n)\Gamma)$ for a continuous $F$. Indeed the results of Leibman [12] on the distribution
of $(g(n)\Gamma)_{n \in \mathbb{Z}}$ imply that the sequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ cannot have isolated values, yet
there are bracket polynomials which do. A simple example is $\lfloor 1 - \{n\sqrt{2}\} \rfloor$, which is
zero except when $n = 0$.
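
As a quick numerical sanity check of this example (a sketch only, not part of the argument; it uses floating-point arithmetic, which is adequate for small $|n|$, and the helper names are our own), one can tabulate $\lfloor 1 - \{n\sqrt{2}\} \rfloor$ on a short range and confirm that the only non-zero value occurs at $n = 0$:

```python
import math


def frac(t: float) -> float:
    """Fractional part {t} = t - floor(t), lying in [0, 1)."""
    return t - math.floor(t)


def bracket_example(n: int) -> int:
    """Evaluate floor(1 - {n*sqrt(2)}) in floating point (fine for small |n|)."""
    return math.floor(1.0 - frac(n * math.sqrt(2)))


# Tabulate the bracket polynomial on a small symmetric range.
values = {n: bracket_example(n) for n in range(-50, 51)}

# Indices at which the sequence is non-zero: the value at n = 0 is isolated.
nonzero = [n for n, v in values.items() if v != 0]
print(nonzero)
```

The isolated value at $n = 0$ is exactly the behaviour that a sequence $F(g(n)\Gamma)$ with continuous $F$ cannot exhibit.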

One does nonetheless feel that the discontinuities of $\tau_i$ are "mild", as this function is
continuous on that part of $G/\Gamma$ which is identified with $(0,1)^m$. However, the sequence
$(g(n)\Gamma)_{n \in \mathbb{Z}}$ may well concentrate on a highly singular subset of $G/\Gamma$, as we discussed at
length in [9]. Thus a certain amount of further work is required to obtain the expected
result, which is the following.

Theorem 5.2 (Möbius and bracket polynomials). Suppose that $p(n)$ is a bracket poly-
nomial and that $\Psi : [0,1] \to [-1,1]$ is a Lipschitz function. Then we have the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{p(n)\}) \ll_{A,\Psi} \log^{-A} N,$$

where the implied constant depends only on $A$, $\Psi$ and the complexity of $p$ (but is inef-
fective).

We shall illustrate how this theorem may be deduced from Theorem 1.1 by discussing
two related special cases. We will then sketch the details that are required in order to
write down a complete proof. The authors plan to include a complete proof of Theorem
5.2 in a future publication.

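Both special cases will take place on the Heisenberg nilmanifold $G/\Gamma$, where

$$G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

Computations with Mal'cev bases in this setting were given in [7, Appendix B] and then
again in [9, §5], where we took

$$e_1 = \exp(X_1) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_2 = \exp(X_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_3 = \exp(X_3) = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$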

We briefly recall some of the computations carried out in somewhat more detail in that
paper; in any case the proofs are nothing more than computations with $3 \times 3$ matrices.
The coordinate function $\psi : G \to \mathbb{R}^3$ is then given by the formula

$$\psi\left( \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \right) = (x, y, z - xy),$$

and the element written here is equivalent, under right multiplication by an element of
$\Gamma$, to the element with coordinates

$$(\{x\}, \{y\}, \{z - xy + \lfloor x \rfloor y\}).$$

Note that this lies inside the fundamental domain $[0,1)^3$. It follows that, for any $\alpha, \beta \in
\mathbb{R}$, we have

$$\{n\beta \lfloor n\alpha \rfloor\} = \tau_3(g(n)\Gamma),$$

where $\tau_3 : G/\Gamma \to [0,1)$ is the map into the third coordinate and $g : \mathbb{Z} \to G$ is the
polynomial sequence given by

$$g(n) = \begin{pmatrix} 1 & n\alpha & n^2\alpha\beta \\ 0 & 1 & n\beta \\ 0 & 0 & 1 \end{pmatrix}.$$

This is an explicit example of the representation of a bracket polynomial, in this case
$\{n\beta \lfloor n\alpha \rfloor\}$, in the form discussed in Bergelson and Leibman's theorem.
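
The identity $\{n\beta\lfloor n\alpha\rfloor\} = \tau_3(g(n)\Gamma)$ can be checked numerically. The following is a minimal sketch under stated assumptions: it works in floating point, it reduces an upper unitriangular matrix to the fundamental domain by an explicit right multiplication by an element of $\Gamma$, and the function names (`reduce_to_fundamental_domain`, `tau3`) are hypothetical helpers introduced only for illustration.

```python
import math


def frac(t: float) -> float:
    """Fractional part {t} = t - floor(t)."""
    return t - math.floor(t)


def reduce_to_fundamental_domain(x: float, y: float, z: float):
    """Mal'cev coordinates of the representative of (1 x z; 0 1 y; 0 0 1)*Gamma
    in [0,1)^3, found by right-multiplying by (1 a c; 0 1 b; 0 0 1) in Gamma."""
    a = -math.floor(x)          # integer shift for the (1,2) entry
    b = -math.floor(y)          # integer shift for the (2,3) entry
    x1 = x + a
    y1 = y + b
    z1 = z + x * b              # (1,3) entry of the product before choosing c
    t3 = z1 - x1 * y1           # third coordinate, using psi = (x, y, z - xy)
    c = -math.floor(t3)         # integer c making the third coordinate fractional
    return (x1, y1, t3 + c)


def tau3(n: int, alpha: float, beta: float) -> float:
    """Third coordinate of g(n)Gamma for g(n) = (1 n*a n^2*a*b; 0 1 n*b; 0 0 1)."""
    return reduce_to_fundamental_domain(n * alpha, n * beta, n * n * alpha * beta)[2]


def circle_distance(s: float, t: float) -> float:
    """Distance between s and t in R/Z (robust to rounding near 0 and 1)."""
    d = abs(s - t) % 1.0
    return min(d, 1.0 - d)


alpha, beta = math.sqrt(2), math.sqrt(3)
worst = max(
    circle_distance(tau3(n, alpha, beta), frac(n * beta * math.floor(n * alpha)))
    for n in range(1, 201)
)
print(worst)  # expected to be at the level of floating-point rounding error
```

Comparing in $\mathbb{R}/\mathbb{Z}$ rather than on $[0,1)$ avoids spurious mismatches when a value lies within rounding error of an integer.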

We discuss two different cases. 

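Case 1. $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$. Then the sequence $(g(n)\Gamma)_{n \in [N]}$ is totally $N^{-c}$-equidistrib-
uted on $G/\Gamma$, which makes life rather easy. To prove the equidistribution one may use
[9, Theorem 2.9] together with the lower bound

$$\min_{\substack{|k_1|, |k_2| \le K \\ (k_1, k_2) \ne (0,0)}} \|k_1\sqrt{2} + k_2\sqrt{3}\|_{\mathbb{R}/\mathbb{Z}} \gg K^{-C},$$

which follows from the fact that, for any $k_3$ with $|k_3| \ll K$, $k_1\sqrt{2} + k_2\sqrt{3} + k_3$ satisfies a
quartic over $\mathbb{Z}$ with coefficients of size $K^{O(1)}$. Although the function $\tau_3$ is not continuous,
it is continuous outside of a subset of $G/\Gamma$ of measure zero, namely outside of $[0,1)^3 \setminus
(0,1)^3$. This means that it may be approximated by Lipschitz functions. More precisely,
for any fixed Lipschitz function $\Psi : [0,1] \to [-1,1]$ and any $\varepsilon > 0$ one may find functions
$F_1, F_2 : G/\Gamma \to \mathbb{C}$ with $\|F_1\|_\infty, \|F_2\|_\infty \le 1$, $\|F_1\|_{\mathrm{Lip}}, \|F_2\|_{\mathrm{Lip}} \le \varepsilon^{-O(1)}$, $|\Psi \circ \tau_3 - F_1| \le F_2$
pointwise and $\int_{G/\Gamma} F_2 \le \varepsilon$. From Proposition 2.1 we have

$$|\mathbb{E}_{n \in [N]} \mu(n) F_1(g(n)\Gamma)| \ll \varepsilon^{-O(1)} N^{-c},$$

and the uniform distribution of $(g(n)\Gamma)_{n \in [N]}$ implies that

$$\mathbb{E}_{n \in [N]} F_2(g(n)\Gamma) \le \varepsilon + O(\varepsilon^{-O(1)} N^{-c}).$$

Now we have the bounds

$$|\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{3} \lfloor n\sqrt{2} \rfloor\})| = |\mathbb{E}_{n \in [N]} \mu(n) \, \Psi \circ \tau_3(g(n)\Gamma)|$$

$$\le |\mathbb{E}_{n \in [N]} \mu(n) F_1(g(n)\Gamma)| + \mathbb{E}_{n \in [N]} F_2(g(n)\Gamma).$$

Letting $\varepsilon = N^{-c'}$ for some sufficiently small $c' > 0$, we obtain an effective and much
stronger version of Theorem 5.2 in this case, namely the bound

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{3} \lfloor n\sqrt{2} \rfloor\}) \ll N^{-c}.$$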

Case 2. $\alpha = \beta = \sqrt{2}$. Now the sequence $(g(n)\Gamma)_{n \in [N]}$ is manifestly not uniformly
distributed on $G/\Gamma$. In fact $g$ takes values in the one-dimensional subgroup $G' \subseteq G$
defined by

$$G' = \left\{ \begin{pmatrix} 1 & x & \frac{x^2}{2} \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} : x \in \mathbb{R} \right\}.$$

The preceding argument breaks down. One could appeal to Theorem 1.1 instead of
Proposition 2.1, but the problem comes when one tries to control the term

$$\mathbb{E}_{n \in [N]} F_2(g(n)\Gamma).$$

Without knowing something more about the relation between the support properties of
$F_2$ and the orbit $(g(n)\Gamma)_{n \in [N]}$, it is not possible to control this term.

In the case at hand $(g(n)\Gamma)_{n \in [N]}$ is $N^{-c}$-equidistributed in the nilmanifold $G'/\Gamma'$ where
$\Gamma' := \Gamma \cap G'$. Topologically and algebraically this nilmanifold is nothing more than $\mathbb{R}/\mathbb{Z}$,
but one should note carefully that the Haar measure on this nilmanifold is not the same
as the measure induced from the Haar measure on $G$. This may be used to "explain"
the observation that $n\sqrt{2} \lfloor n\sqrt{2} \rfloor$ is not uniformly distributed modulo one; see [2] for
further details.
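
The failure of uniform distribution of $n\sqrt{2}\lfloor n\sqrt{2}\rfloor$ modulo one is easy to observe empirically. The sketch below is only an illustration and not part of the paper's argument. To avoid losing precision in the large product, it assumes the elementary identity $xm = (x^2 + m^2 - (x-m)^2)/2$ with $x = n\sqrt{2}$, $m = \lfloor x \rfloor$ and $x^2 = 2n^2$, which gives $n\sqrt{2}\lfloor n\sqrt{2}\rfloor \equiv \tfrac{1}{2}(m \bmod 2) - \tfrac{1}{2}\{n\sqrt{2}\}^2 \pmod 1$. The function names are our own.

```python
import math


def bracket_value(n: int) -> float:
    """Return { n*sqrt(2) * floor(n*sqrt(2)) } in [0, 1).

    Uses x*m = (x^2 + m^2 - (x - m)^2) / 2 with x = n*sqrt(2), m = floor(x),
    so that modulo 1 the value is (m mod 2)/2 - u^2/2, where u = {x}.
    """
    m = math.isqrt(2 * n * n)        # exact value of floor(n*sqrt(2))
    u = n * math.sqrt(2) - m         # fractional part {n*sqrt(2)} (floating point)
    v = (m % 2) / 2 - u * u / 2
    return v - math.floor(v)


def histogram(N: int, bins: int = 10) -> list:
    """Count how many of the values for n = 1..N fall in each of `bins` equal bins."""
    counts = [0] * bins
    for n in range(1, N + 1):
        index = min(int(bracket_value(n) * bins), bins - 1)
        counts[index] += 1
    return counts


counts = histogram(20000)
print(counts)  # a uniformly distributed sequence would give roughly equal counts
```

The counts are visibly unequal: the values pile up just below $1/2$ and just below $1$, which matches the picture of $G'/\Gamma'$ as two curved segments inside the fundamental domain.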

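Inside $G/\Gamma$, $G'/\Gamma'$ may be identified with the union of two segments

$$\left\{ \begin{pmatrix} 1 & x & \frac{x^2}{2} \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix}\Gamma : 0 \le x < 1 \right\} \cup \left\{ \begin{pmatrix} 1 & x & \frac{x^2+1}{2} \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix}\Gamma : 0 \le x < 1 \right\},$$

and this makes it clear that the induced map $\tau_3 : G'/\Gamma' \to [0,1)$ is continuous away
from a single point. By an analysis very similar to the preceding one it may once again
be shown that

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{2} \lfloor n\sqrt{2} \rfloor\}) \ll N^{-c}$$

for any fixed Lipschitz function $\Psi : [0,1] \to [-1,1]$.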

Amongst examples of the form $n\beta \lfloor n\alpha \rfloor$ there is a third distinct case, typified by
$\alpha = \beta = 2^{1/3}$. We leave the analysis of this to the reader.

Sketch proof of the general case of Theorem 5.2. By Theorem 5.1, the result of
Bergelson and Leibman, it suffices to show, for any fixed Lipschitz function $\Psi : [0,1] \to
[-1,1]$, that

$$\mathbb{E}_{n \in [N]} \mu(n) (\Psi \circ \tau_i)(g(n)\Gamma) \ll_{\Psi,A} \log^{-A} N.$$

Here, the notation and parameters are as described in Theorem 5.1. Now $\tau_i$ is continuous
outside the set $[0,1)^m \setminus (0,1)^m$, which has zero measure in $G/\Gamma$. The issue lies in
understanding how the orbit $(g(n)\Gamma)_{n \in [N]}$ interacts with this.

Now the main results of [9] allow us to get a handle on this situation. Consider in
particular the decomposition of $g$ as $\varepsilon g' \gamma$ which was obtained in [9, Theorem 1.19].
Recall that $\varepsilon : \mathbb{Z} \to G$ is slowly varying, $\gamma : \mathbb{Z} \to G$ is rational and $g' : \mathbb{Z} \to G'$ is
such that $(g'(n)\Gamma')_{n \in [N]}$ is totally equidistributed. For a full proof of Theorem 5.2 one
would naturally need to specify appropriate quantitative parameters here. Suppose for
simplicity that $\varepsilon = \gamma = \mathrm{id}_G$ (this was, in fact, the case in the two examples above).

Choose a Mal'cev basis for $G'/\Gamma'$ with coordinate map $\psi' : G' \to \mathbb{R}^{m'}$. Then $G'/\Gamma'$
may be identified with the region $\psi'^{-1}([0,1)^{m'}) \subseteq G$, and in this way we think of the
coordinate function $\tau_i$ as a function on $G'/\Gamma'$. Write $\tilde{\tau}_i$ for the corresponding function
on $[0,1)^{m'}$. It can be shown, making extensive use of the results of [9, Appendix A],
that $\tilde{\tau}_i$ is continuous outside of a piecewise polynomial set of positive codimension, that
is to say outside of a finite union of sets each of which is defined by some polynomial
inequalities $a \le P(t_1, \ldots, t_{m'}) < b$ and at least one nontrivial polynomial equation
$Q(t_1, \ldots, t_{m'}) = c$. Related matters are discussed at greater length in [2]; in the two
examples we discussed, these piecewise polynomial sets were rather simple. These sets
are certainly well-behaved enough that $\tau_i$ may be approximated using Lipschitz functions
$F_1$ and $F_2$ as in our treatment of the bracket polynomial $n\sqrt{3} \lfloor n\sqrt{2} \rfloor$, and in this way
one may use Theorem 1.1 to obtain the desired bound

$$\mathbb{E}_{n \in [N]} \mu(n) (\Psi \circ \tau_i)(g'(n)\Gamma) \ll_A \log^{-A} N.$$

If $G' \ne \{\mathrm{id}\}$ then one may in fact use Proposition 2.1 to obtain the stronger bound of
$N^{-c}$, as in the examples.

If $\varepsilon$ and $\gamma$ are not trivial it is even more complicated to write down a fully rigorous
argument, but conceptually things are not much harder at all. The introduction of the
smooth function $\varepsilon(n)$ has a rather benign effect; if $n$ ranges over an interval of length
$\delta' N$, for suitably small $\delta' = \delta'(\delta)$, the discontinuities of the functions $x \mapsto \tau_i(\varepsilon(n)x\Gamma)$
are all contained inside a "nice" set of measure at most $\delta$, and one may proceed much
as before. All one need do, then, is split the range $[N]$ into suitably short intervals of
this type.

The introduction of $\gamma$ may be handled much as it was in the proof of Theorem 1.1.
One splits each of the intervals from the previous paragraph into progressions $P_j$ with
the same (small) common difference $q$ such that $\gamma(n)\Gamma$ is constant and equal to $\gamma_j\Gamma$ on
$P_j$. One then works with the conjugated sequences $\gamma_j^{-1} g'(n) \gamma_j$ as we did at the end of
§2. □



We conclude by remarking on some variants and generalizations of Theorem 5.2. If
$p_1, \ldots, p_M$ are bracket polynomials and $F : (\mathbb{R}/\mathbb{Z})^M \to \mathbb{C}$ is a smooth function then one
could establish the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) F(\{p_1(n)\}, \ldots, \{p_M(n)\}) \ll_A \log^{-A} N$$

by Fourier decomposition of $F$ and Theorem 5.2. One could, if desired, restrict the
range of the average to some fixed subprogression $P \subseteq [N]$ by the standard technique
of approximating the cutoff $1_P(n)$ by a smoother function $\tilde{1}_P(n)$ and then developing
this as a Fourier expansion.



6. The Liouville function

Everything we have proved for the Möbius function also holds for the Liouville func-
tion $\lambda : \mathbb{N} \to \{-1, 1\}$, defined to be the unique completely multiplicative function such
that $\lambda(p) = -1$ for all primes $p$. This function is related to the Möbius function via the
identity

$$\lambda(n) = \sum_{r : r^2 | n} \mu\left(\frac{n}{r^2}\right).$$
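
The identity relating $\lambda$ and $\mu$ is easy to confirm by direct computation. The following sketch (trial-division factorization, suitable only for small $n$; all function names are our own) compares the two sides for $n \le 2000$:

```python
def factorize(n: int) -> dict:
    """Prime factorization of n by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def mobius(n: int) -> int:
    """Moebius function: (-1)^k for a product of k distinct primes, else 0."""
    exponents = factorize(n).values()
    if any(e > 1 for e in exponents):
        return 0
    return -1 if len(exponents) % 2 else 1


def liouville(n: int) -> int:
    """Liouville function: (-1)^(number of prime factors counted with multiplicity)."""
    return -1 if sum(factorize(n).values()) % 2 else 1


def liouville_via_mobius(n: int) -> int:
    """Right-hand side of the identity: sum of mu(n / r^2) over r with r^2 | n."""
    total = 0
    r = 1
    while r * r <= n:
        if n % (r * r) == 0:
            total += mobius(n // (r * r))
        r += 1
    return total


# Every n in the tested range should satisfy the identity.
mismatches = [n for n in range(1, 2001) if liouville(n) != liouville_via_mobius(n)]
print(mismatches)
```

Only the term with $n/r^2$ square-free survives in the sum, which is why the right-hand side always equals $\pm 1$.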



Thus, with the notation and assumptions of Theorem 1.1, we have

$$|\mathbb{E}_{n \in [N]} \lambda(n) F(g(n)\Gamma)| \le \sum_{r \le \sqrt{N}} \frac{1}{r^2} |\mathbb{E}_{m \in [N/r^2]} \mu(m) F(g(r^2 m)\Gamma)|.$$

Now by [9, Corollary 6.8] $m \mapsto g(r^2 m)$ is a polynomial sequence with coefficients in the
same filtration $G_\bullet$ as $g$, and so we have the bound

$$|\mathbb{E}_{m \in [N/r^2]} \mu(m) F(g(r^2 m)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \log^{-A}(N/r^2)$$

uniformly in $r$, so long as $N/r^2 \ge 2$. Summing over $r$ we obtain

$$|\mathbb{E}_{n \in [N]} \lambda(n) F(g(n)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \left( \sum_{r \le \sqrt{N/2}} \frac{1}{r^2} \log^{-A}(N/r^2) + \sum_{\sqrt{N/2} < r \le \sqrt{N}} \frac{1}{r^2} \right).$$

This is precisely Theorem 1.1, but with $\lambda$ taking the place of $\mu$. In a similar fashion,
all of the results of the preceding section concerning bracket polynomials may now also
be deduced with $\lambda$ in place of $\mu$.

7. A RECURRENCE RESULT ALONG THE PRIMES 

In this section we derive the following result. Here $p_1, p_2, p_3, \ldots$ is the sequence of
primes.

Theorem 7.1 (Prime return times on a nilmanifold). Suppose that $G/\Gamma$ is a nilmanifold
and that $g \in G$ is such that left-multiplication by $g$ is ergodic. Then for every $x \in G/\Gamma$
the sequence $(g^{p_n} x\Gamma)_{n=1,2,\ldots}$ is equidistributed in $G/\Gamma$ in the sense that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g^{p_n} x\Gamma) = \int_{G/\Gamma} F$$

for all continuous functions $F : G/\Gamma \to [-1,1]$.



Remarks. We recall (from discussions in the companion paper [9]) Leon Green's
criterion for ergodicity of left-multiplication by $g$: this map is ergodic if and only if
rotation by $\pi(g)$ is ergodic on the horizontal torus $(G/\Gamma)_{\mathrm{ab}}$, that is to say if and only if
the entries of $\pi(g)$ together with $1$ are linearly independent over $\mathbb{Q}$. If this is the case
then left-multiplication by any power of $g$ is uniquely ergodic, that is to say

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g^{tn} x\Gamma) = \int_{G/\Gamma} F \qquad (7.1)$$

for all $x \in G/\Gamma$ and for $t = 1, 2, 3, \ldots$.

Proof of Theorem 7.1. Let $w$ be a large number and set $W := \prod_{p < w} p$. Fix a
nilmanifold $G/\Gamma$ and a Lipschitz function $F : G/\Gamma \to [-1,1]$ (the case of a general
continuous $F$ then follows by uniform approximation by Lipschitz functions).
Then uniformly in the residues $b$ coprime to $W$ we have

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \left( \frac{\phi(W)}{W} \Lambda'(Wn + b) - 1 \right) F(g^n x\Gamma) = o_{w \to \infty}(1), \qquad (7.2)$$

where the convergence is uniform in $x \in G/\Gamma$ and $g \in G$. This follows very quickly
from [8, Proposition 10.2], which was proved under the assumption of the Möbius and
Nilsequences conjectures MN(s) which we have established in this paper. Recall that
$\Lambda'(p) = \log p$ and that $\Lambda'(n) = 0$ if $n$ is not a prime, that is to say $\Lambda'$ is a modified version
of the von Mangoldt function with no support on the prime powers $p^2, p^3, \ldots$. We recall
that the proof of (7.2) is quite substantial. One splits the von Mangoldt function $\Lambda$
in a certain way as the sum of two pieces $\Lambda^\sharp + \Lambda^\flat$. The contribution from the second
piece is bounded using the MN(s) conjecture, and this is not particularly difficult. The
contribution from the first piece is bounded using the machinery of Gowers norms, and
here one must estimate the dual Gowers norm of the nilsequence $F(g^n x\Gamma)$ as well as the
Gowers norm of objects related to $\Lambda^\sharp$. This is a substantial amount of work.

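Let us return to the proof at hand. Since (7.2) is uniform in $g$ and $x$, we may replace
$g$ by $g^W$ and $x$ by $g^b x$ to get

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} \left( \frac{\phi(W)}{W} \Lambda'(n) - 1 \right) F(g^n x\Gamma) = o_{w \to \infty}(1)$$

uniformly for all progressions $P_{b,W} = \{Wn + b : n \in [N]\}$, $b = 0, 1, \ldots, W - 1$. However
it follows from (7.1) that, for fixed $b$ and $W$,

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} F(g^n x\Gamma) = \int_{G/\Gamma} F.$$

Comparing these last two expressions we obtain

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} \frac{\phi(W)}{W} \Lambda'(n) F(g^n x\Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1),$$

uniformly for $b$ coprime to $W$. Now if $b$ is not coprime to $W$ we obviously have

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} \frac{\phi(W)}{W} \Lambda'(n) F(g^n x\Gamma) = o_{w \to \infty}(1)$$

since $\Lambda'$ is supported on the primes and $F$ is bounded by 1.
Summing over $b$, one may conclude that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [WN]} \Lambda'(n) F(g^n x\Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1).$$

This is easily seen to imply that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \Lambda'(n) F(g^n x\Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1).$$

The left-hand side no longer depends on $w$, so we may let $w \to \infty$. Doing so, we obtain

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \Lambda'(n) F(g^n x\Gamma) = \int_{G/\Gamma} F.$$

An easy argument using the prime number theorem, noting that $\Lambda'(p_n)$ is essentially
$\log N$ for almost all primes $p_n$, $n \le N$, concludes the proof. □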

Very straightforward approximation arguments allow one to replace the continuous
function $F$ by a function with mild discontinuities. In this way one could prove, for
example, that the sequence $p_n\sqrt{3} \lfloor p_n\sqrt{2} \rfloor$ is uniformly distributed modulo one. We leave
the details, which are essentially all present in the earlier discussion of $n\sqrt{3} \lfloor n\sqrt{2} \rfloor$, to
the reader.

Appendix A. Mobius and periodic functions 

In this appendix we give the proof of Proposition A.2. The argument is, quite apart
from being completely standard, already contained in ^ Chapter 3]. We nonetheless 
take the opportunity to recall it here, as we wish to emphasise the fact that the main
input to this part of the argument is information on the zeros of L-functions. Our
starting point is the following proposition.

Proposition A.1. For any $A > 0$ we have

$$\mathbb{E}_{n \in [N]} \mu(n) \chi(n) \ll_A q^{1/2} \log^{-A} N \qquad (A.1)$$

for all Dirichlet characters $\chi$ to modulus $q$.

Remark. This follows from the nonexistence of zeros of $L(s, \chi)$ close to the line $\Re s = 1$.
For the details, see [10, Prop. 5.29]. As noted in [10, p. 124] there are difficulties involved
in applying the standard Perron's formula approach to $\mathbb{E}_{n \in [N]} \mu(n) \chi(n)$ directly, and it
is rather easier to first obtain bounds on $\mathbb{E}_{n \in [N]} \Lambda(n) \chi(n)$.

Using standard techniques of harmonic analysis we may obtain the following conse-
quence of Proposition A.1.

Proposition A.2 (Möbius is orthogonal to periodic sequences). Let $f : \mathbb{N} \to \mathbb{C}$ be a
sequence bounded in magnitude by 1 which is periodic of some period $q \ge 1$. Then we
have

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{f(n)} \ll_A q \log^{-A} N$$

for all $A > 0$, where the implied constant is ineffective.

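Proof. We first establish the estimate under the additional assumption that $f(n)$ van-
ishes whenever $(n, q) \ne 1$. Then $f$ can be viewed as a function on the multiplicative
group $(\mathbb{Z}/q\mathbb{Z})^\times$, and thus has a Fourier expansion

$$f(n) = \sum_\chi \hat{f}(\chi) \chi(n), \quad \text{where } \hat{f}(\chi) := \mathbb{E}_{n \in (\mathbb{Z}/q\mathbb{Z})^\times} f(n) \overline{\chi(n)},$$

with $\chi$ ranging over all the characters on $(\mathbb{Z}/q\mathbb{Z})^\times$. Applying Proposition A.1 and the
triangle inequality, we conclude

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{f(n)} \ll_A q^{1/2} \log^{-A} N \left( \sum_\chi |\hat{f}(\chi)| \right).$$

But from Cauchy-Schwarz and Plancherel we have

$$\sum_\chi |\hat{f}(\chi)| \le \phi(q)^{1/2} \left( \sum_\chi |\hat{f}(\chi)|^2 \right)^{1/2} \le \phi(q)^{1/2},$$

where $\phi(q) := |(\mathbb{Z}/q\mathbb{Z})^\times|$ is the Euler totient function. Since $\phi(q) \le q$, the claim follows.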

Now we consider the general case, in which $(n, q)$ is not necessarily equal to 1 on
the support of $f$. Observe that if $\mu(n)$ is non-zero, then $n$ is square-free, and we can
split $n = dm$, where $d = (n, q)$ is square-free (so $\mu^2(d) = 1$) and $m$ is coprime to $q$.
Furthermore we have $\mu(n) = \mu(d)\mu(m)$. We thus obtain the decomposition

$$\mathbb{E}_{n \in [N]} \mu(n) f(n) = \frac{1}{N} \sum_{d | q} \mu(d) \sum_{m \le N/d} \mu(m) f(dm) 1_{(m,q)=1}. \qquad (A.2)$$
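
The decomposition (A.2) can be tested on small examples. The sketch below is illustrative only: it works with unnormalised sums (both sides of (A.2) multiplied by $N$) so that everything stays in exact integer arithmetic, it sieves the Möbius function, and the integer-valued periodic test function `f` and the other names are our own choices.

```python
from math import gcd


def mobius_table(limit: int) -> list:
    """Return a list mu with mu[n] the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for k in range(2 * p, limit + 1, p):
                is_prime[k] = False
            for k in range(p, limit + 1, p):
                mu[k] = -mu[k]          # one sign flip per prime divisor
            for k in range(p * p, limit + 1, p * p):
                mu[k] = 0               # not square-free
    return mu


def direct_sum(N: int, f, mu: list) -> int:
    """Left-hand side of (A.2) times N: sum of mu(n) f(n) over n <= N."""
    return sum(mu[n] * f(n) for n in range(1, N + 1))


def decomposed_sum(N: int, q: int, f, mu: list) -> int:
    """Right-hand side of (A.2) times N: split n = d*m with d | q and (m, q) = 1."""
    total = 0
    for d in range(1, q + 1):
        if q % d != 0:
            continue
        inner = sum(mu[m] * f(d * m) for m in range(1, N // d + 1) if gcd(m, q) == 1)
        total += mu[d] * inner
    return total


N = 1000
mu = mobius_table(N)
checks = []
for q in (12, 30):
    # A hypothetical integer-valued test function of period q with values in {-1, 0, 1}.
    f = lambda n, q=q: ((n % q) ** 2 + 1) % 3 - 1
    checks.append((q, direct_sum(N, f, mu), decomposed_sum(N, q, f, mu)))
print(checks)
```

The agreement reflects the bijection between square-free $n \le N$ and pairs $(d, m)$ with $d \mid q$, $m \le N/d$ coprime to $q$, and $dm$ square-free.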

The sequence $m \mapsto f(dm) 1_{(m,q)=1}$ is periodic of period $q/d$ and vanishes whenever
$(m, q/d) \ne 1$, hence by the preceding arguments

$$\sum_{m \le N/d} \mu(m) f(dm) 1_{(m,q)=1} \ll_A \frac{qN}{d^2} \log^{-A} N.$$

Thus from (A.2) we have

$$\mathbb{E}_{n \in [N]} \mu(n) f(n) \ll_A \sum_{d | q} \frac{q}{d^2} \log^{-A} N \ll q \log^{-A} N,$$

concluding the proof of Proposition A.2. □


1. Introduction

Important remark. This paper is intimately tied to, and is intended to be read
in conjunction with, the longer companion paper [9], which proves results about the
distribution of finite polynomial orbits on nilmanifolds. In particular, we shall make
heavy use of the notation and lemmas from that paper.

The aim of this paper is to establish what the authors have been referring to as the
Möbius and Nilsequence conjecture MN(s), first stated as [8, Conjecture 8.5]. Roughly
speaking, this states that the Möbius function $\mu(n)$, defined as $(-1)^k$ when $n$ is the
product of $k$ distinct primes, and $0$ otherwise, is asymptotically strongly orthogonal to
any Lipschitz $s$-step nilsequence $(F(a^n x))_{n \in \mathbb{Z}}$, in the sense that the inner product

$$\mathbb{E}_{n \in [N]} \mu(n) F(a^n x)$$

of these two functions on $[N] := \{1, \ldots, N\}$ decays to zero faster than any fixed
power of $1/\log N$. Here and in the sequel we use the averaging notation $\mathbb{E}_{x \in X} f(x) :=
\frac{1}{|X|} \sum_{x \in X} f(x)$ for any finite set $X$. Recall also that a Lipschitz $s$-step nilsequence is
any sequence of the form $F(a^n x)$, where $a$ is an element of an $s$-step connected and sim-
ply connected nilpotent Lie group $G$, $x$ is an element of the nilmanifold $G/\Gamma$ for some
discrete cocompact subgroup $\Gamma \le G$ of $G$, and $F : G/\Gamma \to \mathbb{R}$ is a Lipschitz function.

The difficulty of this conjecture increases with $s$. The case $s = 0$ of this conjecture is
the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \ll_A \log^{-A} N.$$

The stronger estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \ll e^{-c\sqrt{\log N}}$$

is essentially equivalent to the prime number theorem (with classical error term).
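
To get a feel for the size of the averages $\mathbb{E}_{n \in [N]} \mu(n)$ in the case $s = 0$, one can simply compute them. The sketch below (a sieve for $\mu$ followed by a running sum; the function names are our own, and a finite computation of course proves nothing about the asymptotic rate) prints the averages at a few values of $N$:

```python
def mobius_table(limit: int) -> list:
    """Return a list mu with mu[n] the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for k in range(2 * p, limit + 1, p):
                is_prime[k] = False
            for k in range(p, limit + 1, p):
                mu[k] = -mu[k]          # one sign flip per prime divisor
            for k in range(p * p, limit + 1, p * p):
                mu[k] = 0               # not square-free
    return mu


def mobius_average(N: int, mu: list) -> float:
    """The average E_{n in [N]} mu(n) = (1/N) * sum_{n <= N} mu(n)."""
    return sum(mu[1:N + 1]) / N


LIMIT = 10000
mu = mobius_table(LIMIT)
averages = {N: mobius_average(N, mu) for N in (10, 100, 1000, 10000)}
for N, value in averages.items():
    print(N, value)
```

In this range the averages are already very small, consistent with (though far from a proof of) the cancellation asserted above.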

The case $s = 1$ may be reduced by Fourier analysis to the estimate

$$|\mathbb{E}_{n \in [N]} \mu(n) e(\alpha n)| \ll_A \log^{-A} N \qquad (1.1)$$

where $e(x) := e^{2\pi i x}$, required to hold uniformly for all $\alpha \in \mathbb{R}$. This was established
by Davenport [3] in the 1930s by modifying Vinogradov's method of bilinear forms (or
"Type I and Type II sums").

In the case s = 2 the conjecture was established by the authors in [7]. For a more 
complete discussion of the conjecture and the reasons for being interested in it (and 
in particular, its applications to the generalised Hardy-Littlewood conjecture on the 
number of solutions to systems of linear equations in which the unknowns are all prime) 
the reader may refer to the introduction of [7], the first several sections of [8], or any of
the expository articles [H El HU \W\ . 

In this paper we settle the Möbius and Nilsequence conjecture. In fact, we shall
prove the marginally stronger result that the Möbius function is asymptotically strongly
orthogonal to any polynomial nilsequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$.

Theorem 1.1 (Main Theorem). Let $G/\Gamma$ be a nilmanifold of some dimension $m \ge 1$,
let $G_\bullet$ be a filtration of $G$ of some degree $d \ge 1$, and let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be a polynomial
sequence. Suppose that $G/\Gamma$ has a $Q$-rational Mal'cev basis $\mathcal{X}$ for some $Q \ge 2$, defining
a metric $d_{\mathcal{X}}$ on $G/\Gamma$. Suppose that $F : G/\Gamma \to [-1,1]$ is a Lipschitz function. Then we
have the bound

$$|\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \log^{-A} N$$

for any $A > 0$ and $N \ge 2$. The implied constant is ineffective.

Remarks. By specialising to the linear case $g(n) := a^n b$ for some $a, b \in G$ (and using
the existence of $Q$-rational Mal'cev bases, see [9, Proposition A.9]), Theorem 1.1 im-
mediately implies the Möbius and nilsequences conjecture [8, Conjecture 8.5]. In fact it
gives a somewhat more precise result, since the dependence on $Q$ and $\|F\|_{\mathrm{Lip}}$ is given
quite explicitly. For the application of Theorem 1.1 in [8], however, knowledge of these
dependencies is not necessary.

The ineffectivity of the bound in Theorem 1.1 already occurs for sufficiently large $A$ in
the 1-step case (which, as mentioned before, is essentially (1.1)), and is ultimately due
to the well-known ineffective bounds on Siegel zeroes. On the other hand, the remainder
of the argument is effective, and so any effective bound for Siegel's theorem would imply
effective bounds for Theorem 1.1. In particular, this would be the case if one assumed
GRH. In fact, in that case it is not difficult to see from modifying the arguments below
that we can replace the logarithmic decay $\log^{-A} N$ by polynomial decay $N^{-c}$ for some
$c > 0$ depending only on $d$ and $m$.

(Footnote to Theorem 1.1, on filtrations.) In other words, $G_\bullet = (G_i)_{i=0}^\infty$ where $G = G_0 \supseteq G_1 \supseteq \ldots \supseteq G_d$ is a descending sequence of Lie
groups and $[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \ge 0$, with the convention that $G_i$ is trivial for $i > d$; see [9,
Definition 1.2].

(Footnote to Theorem 1.1, on polynomial sequences.) A sequence $g : \mathbb{Z} \to G$ lies in $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ if $\partial_{h_i} \cdots \partial_{h_1} g$ takes values in $G_i$ for all $h_1, \ldots, h_i \in \mathbb{Z}$ and
$i \ge 0$, where $\partial_h g(n) := g(n+h) g(n)^{-1}$; see [9, Definition 1.11] and the ensuing discussion.

(Footnote to Theorem 1.1, on Mal'cev bases.) The notion of a $Q$-rational Mal'cev basis is defined in [9, Definition 2.6] and the construction of
the metric $d_{\mathcal{X}}$ is given in the same section.

The authors learnt in [9] that it is in many ways more natural to consider the class
of polynomial sequences $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ rather than simply the class of linear sequences
$n \mapsto a^n x$. This is ultimately due to the stability of the polynomial class under a wide
variety of operations, such as pointwise multiplication. On the other hand, these two
categories are certainly closely related (and are, in some sense, equivalent): see [12] for
further discussion.

Acknowledgements. The first author is partly supported by a Leverhulme Prize.
The second author is supported by a grant from the MacArthur Foundation and by NSF
grant DMS-0649473.

2. Reducing to the equidistributed case 

To prove Theorem 1.1 we will apply [9, Theorem 1.19] to decompose $g$ as a product
$\varepsilon g' \gamma$, where $\varepsilon$ is "smooth", $\gamma$ is "rational" and $g'$ is highly equidistributed in some closed
subgroup $G' \subseteq G$. We will recall the precise statement shortly.

In this section we shall show how the rather harmless factors $\varepsilon$ and $\gamma$ in the above
factorisation may be eliminated, and then make an additional reduction to the case
$\int_{G/\Gamma} F = 0$ (using the Haar measure on $G/\Gamma$, of course). This leaves us with the task
of proving an "equidistributed" case of Theorem 1.1; see Proposition 2.1 below.

For the rest of the paper, all constants $c, C$, including those in the asymptotic notation
$\ll$ and $O()$, are allowed to depend on $m$ and $d$. Different occurrences of the letters $c, C$
may represent different constants; typically we will have $0 < c \le 1 \le C < \infty$. For ease
of notation we drop the subscript whenever Lipschitz norms are mentioned, so $\|F\|_{\mathrm{Lip}}$
becomes simply $\|F\|$.

Recall from [9, Definition 1.3(v)] that a sequence $(g(n)\Gamma)_{n \in [N]}$ in a nilmanifold is
totally $\delta$-equidistributed if we have

$$|\mathbb{E}_{n \in P} F(g(n)\Gamma)| \le \delta \|F\| \qquad (2.1)$$

for all Lipschitz functions $F : G/\Gamma \to \mathbb{C}$ with $\int_{G/\Gamma} F = 0$ and all arithmetic progressions
$P \subseteq [N]$ of length at least $\delta N$.

In the next section we shall establish the following result about the lack of correlation 
of Mobius with equidistributed nilsequences. 

Proposition 2.1 (Möbius is orthogonal to equidistributed sequences). Let $m \ge 0$,
$d \ge 1$ be integers and let $N \ge 1$ be an integer parameter which is sufficiently large
depending on $m$ and $d$. Let $\delta$, $0 < \delta < 1/2$, and $Q \ge 2$ be real parameters. Let
$G/\Gamma$ be an $m$-dimensional nilmanifold, and suppose that $G_\bullet$ is a filtration of degree $d$.
Suppose that $G/\Gamma$ has a $Q$-rational Mal'cev basis $\mathcal{X}$ adapted to the filtration $G_\bullet$. Let
$g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and suppose that $(g(n)\Gamma)_{n \in [N]}$ is totally $\delta$-equidistributed. Then for any
function $F : G/\Gamma \to \mathbb{R}$ with $\int_{G/\Gamma} F = 0$ and for any arithmetic progression $P \subseteq [N]$ of
size at least $N/Q$, we have the bound

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_P(n) F(g(n)\Gamma)| \ll \delta^c Q \|F\| \log N.$$

The proof of Proposition 2.1 proceeds via the method of Type I/II sums, which is
also known as the method of bilinear forms. This is the same method that one might
use to tackle the "minor arcs" case of (1.1), where $\alpha$ is not close to a rational with
small denominator. We will describe it in detail in the next section. Our task for the
remainder of this section is to reduce Theorem 1.1 to Proposition 2.1.

Proof that Proposition 2.1 implies Theorem 1.1. We start with a brief overview. The
main ingredient of this argument is [9, Theorem 1.19], that is to say the factorization
$g = \varepsilon g' \gamma$ mentioned above. In addition to that we require estimates for sums of the type
$\mathbb{E}_{n \in [N]} \mu(n) 1_P(n)$, where $P \subseteq [N]$ is a progression. After standard harmonic analysis,
such bounds ultimately depend on results about the zeros of L-functions $L(s, \chi)$, and
as such this is analysis of the same type as would be used to establish the "major arc"
cases of (1.1). Finally, a fair amount of what might be called "quantitative nil-linear
algebra" is required to keep track of the various nilmanifolds and Lipschitz functions
involved in the argument. Here we draw repeatedly on the material assembled in [9,
Appendix A] for this purpose; we encourage the reader to gloss over these essentially
routine issues on a first reading.

We now turn to the details. We allow all implied constants to depend on $m$ and $d$.

Let the hypotheses be as in Theorem 1.1. To simplify the notation slightly we will
also assume that $\|F\| \ge 1$; the case $\|F\| < 1$ can easily be deduced from that case. By
dividing out by $\|F\|$ we may in fact normalize and assume that $\|F\| = 1$.

We may of course take $A \ge 1$. We may also assume that $Q \le \log N$, since the claim
is vacuously true otherwise; thus $\mathcal{X}$ is now a $\log N$-rational Mal'cev basis. By increasing
$A$ if necessary, it will suffice to show an estimate of the form

$$|\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma)| \ll_A \log^{-A + O(1)} N. \qquad (2.2)$$

Let $B$ be a parameter (depending on $A$) to be specified later. We may assume that $N$
is sufficiently large depending on $A, B$. By [9, Theorem 1.19] (with $M_0 := \log N$) we
can find an integer $M$,

$$\log N \le M \ll \log^{O_B(1)} N,$$

a rational subgroup $G' \subseteq G$, a Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ (where $\Gamma' := G' \cap \Gamma$) in which
each element is an $M$-rational combination (see [9, Definition 1.21]) of the elements of
$\mathcal{X}$, and a decomposition

$$g = \varepsilon g' \gamma \qquad (2.3)$$

into polynomial sequences $\varepsilon, g', \gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ with the following properties:

(i) $\varepsilon : \mathbb{Z} \to G$ is $(M, N)$-smooth (see [9, Definition 1.22] for a definition);
(ii) $g' : \mathbb{Z} \to G'$ takes values in $G'$, and the finite sequence $(g'(n)\Gamma')_{n \in [N]}$ is totally
$M^{-B}$-equidistributed in $G'/\Gamma'$, using the metric $d_{\mathcal{X}'}$ on $G'/\Gamma'$;
(iii) $\gamma : \mathbb{Z} \to G$ is $M$-rational (see [9, Definition 1.21]), and $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic
with period $1 \le q \le M$.

From (2.3) we have

$$\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma) = \mathbb{E}_{n \in [N]} \mu(n) F(\varepsilon(n) g'(n) \gamma(n) \Gamma). \qquad (2.4)$$

The sequence $(\gamma(n)\Gamma)_{n \in \mathbb{Z}}$ is periodic with some period $q$, $1 \le q \le M$. For each $j =
0, 1, \ldots, q - 1$ let $\gamma_j := \{\gamma(j)\}$ be the fractional part of $\gamma(j)$ with respect to $\Gamma$, thus
$\gamma_j \Gamma = \gamma(j)\Gamma$ and all the coordinates $\psi_{\mathcal{X}}(\gamma_j)$ lie in $[0,1)$. This construction is described
in [9, Lemma A.14].

Now by [9, Lemma A.12], the coordinates $\psi_{\mathcal{X}}(\gamma(j))$ lie in $\frac{1}{M'}\mathbb{Z}^m$ for some $M' \ll M^{O(1)}$.
Since $\gamma_j = \gamma(j)\eta$ for some $\eta$ with integer coordinates, it follows from [9, Lemma A.3]
that the coordinates $\psi_{\mathcal{X}}(\gamma_j)$ are rationals with height $\ll M^{O(1)}$.

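We now take advantage of the periodicity of $\gamma(n)\Gamma$ to split the right-hand side of
(2.4) as

$$\sum_{j=0}^{q-1} \mathbb{E}_{n \in [N]} \mu(n) 1_{n \equiv j \ (\mathrm{mod}\ q)} F(\varepsilon(n) g'(n) \gamma_j \Gamma). \qquad (2.5)$$

By the right-invariance of $d_{\mathcal{X}}$, the $(M, N)$-smoothness of $\varepsilon$ (see [9, Definition 1.21]) and
the 1-Lipschitz bound on $F$ we see that

$$|F(\varepsilon(n) g'(n) \gamma_j \Gamma) - F(\varepsilon(n_0) g'(n) \gamma_j \Gamma)| \le d_{\mathcal{X}}(\varepsilon(n) g'(n) \gamma_j, \varepsilon(n_0) g'(n) \gamma_j)$$

$$= d_{\mathcal{X}}(\varepsilon(n_0), \varepsilon(n))$$

$$\le \log^{-A} N$$

whenever $|n - n_0| \le \frac{N}{M \log^A N}$. Hence if we split each progression $n \equiv j \ (\mathrm{mod}\ q)$ into
further progressions $P_{j,k}$ for $k = O(M \log^A N)$, each having diameter at most $\frac{N}{M \log^A N}$,
we see that (2.5) is equal to

$$\sum_{j,k} \mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F(a_{j,k} g'(n) \gamma_j \Gamma) + O(\log^{-A} N). \qquad (2.6)$$

Here each $a_{j,k} := \varepsilon(n_{0,j,k})$ for some $n_{0,j,k} \in P_{j,k}$; by the definition of what it means for
$\varepsilon : \mathbb{Z} \to G$ to be $(M, N)$-smooth (i.e. [9, Definition 1.21]), it follows that $d_{\mathcal{X}}(a_{j,k}, \mathrm{id}_G) \le
M$ and hence, by [9, Lemma A.4], that

$$|\psi_{\mathcal{X}}(a_{j,k})| \ll M^{O(1)}. \qquad (2.7)$$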

If $N$ is sufficiently large depending on $A$ and $B$ then $N \ge 10 M \log^A N$ (say), and this
partition of $[N]$ may be arranged in such a way that

$$|P_{j,k}| \ge \frac{N}{2qM \log^A N} \ge \frac{N}{2M^2 \log^A N}.$$

Since the number of $j$ is at most $M$, and the number of $k$ is $O(M \log^A N)$, we
thus see that to show (2.2) it suffices by the triangle inequality to show that

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F(a_{j,k} g'(n) \gamma_j \Gamma)| \ll_A M^{-2} \log^{-2A + O(1)} N \qquad (2.8)$$

for each $j, k$.

Fix $j, k$. Write $H_j := \gamma_j^{-1} G' \gamma_j$ and let $g_j : \mathbb{Z} \to H_j$ be the sequence defined by
$g_j(n) := \gamma_j^{-1} g'(n) \gamma_j$. It is clear that each $g_j$ is a polynomial sequence with coefficients
in the filtration $(H_j)_\bullet := \gamma_j^{-1} G'_\bullet \gamma_j$.

Set $\Lambda_j := \Gamma \cap H_j$ and define functions

$$F_{j,k} : H_j/\Lambda_j \to [-1,1]$$

by the formula

$$F_{j,k}(x\Lambda_j) := F(a_{j,k} \gamma_j x \Gamma).$$

Then (2.8) can be rewritten as

$$|\mathbb{E}_{n \in [N]} \mu(n) 1_{P_{j,k}}(n) F_{j,k}(g_j(n)\Lambda_j)| \ll_A M^{-2} \log^{-2A + O(1)} N. \qquad (2.9)$$

Suppose for the moment that $F_{j,k}$ were a constant function. Recall that $P_{j,k}$ has
common difference $q \le M$. We may thus apply Proposition A.2 (with $A$ replaced by a
sufficiently large exponent $A'$ depending on $A$ and $B$) to obtain the desired claim, since
$M \ll \log^{O_B(1)} N$. Therefore we may subtract off the mean of $F_{j,k}$ and assume without
loss of generality that $\int_{H_j/\Lambda_j} F_{j,k} = 0$. This may cause $F_{j,k}$ to take values in $[-2,2]$
rather than $[-1,1]$, but we can easily counter this trivial issue by dividing $F_{j,k}$ by two.

In a moment we shall use Proposition 2.1 to estimate the terms appearing here. Before
doing that we record quantitative rationality properties of the nilmanifold $H_j/\Lambda_j$, as
well as a Lipschitz bound on $\|F_{j,k}\|$.

Claim. There is a Mal'cev basis 3^j for Hj/Aj adapted to the filtration (Hj), such 
that each yj is an Af-^-rational combination of the Xj. With respect to the metric 
dyj on Hj/Aj induced by this basis, the polynomial sequence gj G poly(Z, (Hj),) is 
^-cB+o(i)_j.Q^g^jjy equidistributed for some c > depending only on m, d, and we have 

||F,J| <:mow. 

Proof. We shall apply suitable combinations of the lemmas in [9, Appendix A].
The existence of $\mathcal{Y}_j$ follows from Proposition A.9 and Lemma A.13 of [9] together
with the fact that each $\gamma_j$ has rational coordinates with height $M^{O(1)}$. Now the map
$x \mapsto F(\varepsilon_{j,k}\gamma_j x\Gamma)$ on $G/\Gamma$ has Lipschitz constant at most $M^{O(1)}$ by [9, Lemma A.5]
and the bounds $|\psi_{\mathcal{X}}(\varepsilon_{j,k})|, |\psi_{\mathcal{X}}(\gamma_j)| \leq M^{O(1)}$. The last statement of the claim follows
from [9, Lemma A.17]. Finally, the statement about the quantitative equidistribution
of $g_j$ (in which is hidden a slight technical subtlety) is addressed by Lemma B.1 in the
appendix of this paper. □

Let us now apply Proposition 2.1 to (2.9). We apply the proposition with parameters
(which we distinguish using tildes) as follows: $\tilde G := H_j$, $\tilde\Gamma := \Lambda_j$, $\tilde G_\bullet := (H_j)_\bullet$, $\tilde g := g_j$,
$\tilde{\mathcal{X}} := \mathcal{Y}_j$, $\tilde Q := M^{O(1)}$, $\tilde F := F_{j,k}$ and $\tilde\delta := M^{-cB+O(1)}$. We quickly see that the left-hand side of (2.9) is
bounded by $O(M^{-c^2 B+O(1)} \log N)$. Choosing $B$ sufficiently large depending on $A$,
we obtain (2.9) as claimed. □

3. The equidistributed case: Type I and II sums 

In this section we establish Proposition 2.1 using Vinogradov's method of Type I and
II sums in the form due to Vaughan [16]. More precisely, we will use the following
proposition.

Proposition 3.1 (Method of Type I/II sums). Let $f : \mathbb{N} \to \mathbb{C}$ be a function with
$\|f\|_\infty \leq 1$ such that

$$|\mathbb{E}_{N < n \leq 2N} \mu(n) f(n)| \geq \varepsilon$$

for some $\varepsilon > 0$. Then one of the following statements holds:

• (Type I sum is large) There exists an integer $1 \leq K \leq N^{2/3}$ such that

$$|\mathbb{E}_{N/k < w \leq 2N/k} f(kw)| \gg (\varepsilon/\log N)^{O(1)} \tag{3.1}$$

for $\gg (\varepsilon/\log N)^{O(1)} K$ integers $k$ such that $K < k \leq 2K$.

• (Type II sum is large) There exist integers $K, W$ with $\frac{1}{2}N^{1/3} \leq K \leq 4N^{2/3}$ and
$N/4 \leq KW \leq 4N$, such that

$$|\mathbb{E}_{K < k,k' \leq 2K} \mathbb{E}_{W < w,w' \leq 2W} f(kw)\overline{f(k'w)}\,\overline{f(kw')}f(k'w')| \gg (\varepsilon/\log N)^{O(1)}. \tag{3.2}$$

Proof. This is [3, Proposition 4.2], specialised to the case $U = V = N^{1/3}$, and with
certain explicit exponents replaced by unspecified constants $O(1)$. □



We now begin the proof of Proposition 2.1. As before we may normalise so that
$\|F\| = 1$. From this and the mean zero assumption, we see in particular that

$$|F(x)| \leq \operatorname{diam}(G/\Gamma) \ll Q^{O(1)} \tag{3.3}$$

for all $x \in G/\Gamma$ (the diameter bound here is [9, Lemma A.16]).

If $\delta \leq 1/N$ then, by total $\delta$-equidistribution applied to progressions consisting of a single point, we have $|F(g(n)\Gamma)| \leq \delta$ for all $n \in [N]$, and the claim is
trivial, so we may assume that $\delta > 1/N$. By increasing $\delta$ if necessary (and shrinking $c$)
we thus see that we may assume that

$$\delta > N^{-\sigma} \tag{3.4}$$

for any fixed small constant $\sigma > 0$ depending only on $m, d$.

The basic idea, which will become clearer upon reading the details, is to make good use
of the fact that one may test the quantitative equidistribution properties of a polynomial
nilsequence on $G/\Gamma$ by passing to the abelianisation $(G/\Gamma)_{\mathrm{ab}}$, a phenomenon referred
to in [9, Theorem 2.9] as the "quantitative Leibman Dichotomy" (cf. [12]). The abelian
issues that one must then deal with are of a very similar nature to those involved in
dealing with exponential sums such as $\mathbb{E}_{n \in [N]} \mu(n) e(p(n))$, where $p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is an
ordinary polynomial. Rather than quote results from the existing literature on this
problem it is easier for us to invoke various lemmas from [9], which were stated and
proved in a language which is helpful for the present paper.

Let $\varepsilon := \delta^{c_1} Q \log N$, for a constant $c_1$ to be specified later. We may assume that
$\varepsilon < 1$, otherwise the claim is trivial from (3.3) and the triangle inequality. In particular,
we have

$$Q, \log N \leq \delta^{-c_1}$$

and we will use these estimates frequently in the sequel to absorb any polynomial factors
in $Q$ or $\log N$ into a power of $\delta^{-c_1}$.

Suppose for contradiction that Proposition 2.1 failed for these parameters. We then
apply Proposition 3.1 with $f(n) := 1_P(n) F(g(n)\Gamma)$ and $\varepsilon$ as above, concluding that
either (3.1) or (3.2) holds. We deal with these two cases in turn.

The Type I case. Suppose that (3.1) holds. Thus there are $\gg \delta^{O(c_1)} K$ values of
$k \in (K, 2K]$ such that

$$|\mathbb{E}_{N/k < w \leq 2N/k} 1_P(kw) F(g(kw)\Gamma)| \gg \delta^{O(c_1)}.$$

Let $l$ denote the common difference of $P$; since $|P| \geq N/Q$, we must have $1 \leq l \leq Q$.
Splitting into progressions with common difference $l$, we see that for some $b \pmod{l}$ and
for $\gg \delta^{O(c_1)} K$ values of $k \in (K, 2K]$ we have

$$\Big|\sum_{\substack{N/k < w \leq 2N/k \\ w \equiv b \ (\mathrm{mod}\ l)}} 1_P(kw) F(g(kw)\Gamma)\Big| \gg \delta^{O(c_1)} \frac{N}{k}.$$

Setting $w = b + lw'$, this may be rewritten as

$$\Big|\sum_{w' \in I_k} F(g(k(b + lw'))\Gamma)\Big| \gg \delta^{O(c_1)} \frac{N}{k}, \tag{3.5}$$

where $I_k \subseteq [\frac{N}{kl} - 1, \frac{2N}{kl}]$ is an interval.

For each value of $k$ for which this holds, consider the sequence $g_k : \mathbb{Z} \to G$ defined
by $g_k(n) := g(kn)$ and also the sequence $\tilde g_k : \mathbb{Z} \to G$ defined by $\tilde g_k(n) = g(k(b + ln))$.
It follows from [9, Corollary 6.8] that $g_k, \tilde g_k \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Now (3.5) implies that
$(\tilde g_k(n)\Gamma)_{n \in [N_k]}$ fails to be $\delta^{O(c_1)}$-equidistributed in $G/\Gamma$, where $N_k \sim N/kl$.

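It follows from [9, Theorem 2.9] that there is a nontrivial horizontal character $\psi_k :
G \to \mathbb{R}/\mathbb{Z}$ (i.e. a continuous homomorphism from $G$ to $\mathbb{R}/\mathbb{Z}$ which annihilates $\Gamma$) with
magnitude $|\psi_k| \ll \delta^{-O(c_1)}$ such that

$$\|\psi_k \circ \tilde g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)}.$$

Recall from [9, Definition 2.10] that the $C^\infty[N]$-norm of a polynomial $p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$
expanded in binomial coefficients as

$$p(n) = \alpha_0 + \alpha_1 \binom{n}{1} + \dots + \alpha_d \binom{n}{d} \tag{3.6}$$

is defined by

$$\|p\|_{C^\infty[N]} := \sup_{1 \leq j \leq d} N^j \|\alpha_j\|_{\mathbb{R}/\mathbb{Z}}.$$

By [9, Lemma 8.4] (specialised to the single-parameter case $t = 1$), there is some
$q_k \ll \delta^{-O(c_1)}$ such that

$$\|q_k \psi_k \circ g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)}.$$

Pigeonholing in the possible choices of $q_k\psi_k$, we may find some $\psi$ with $0 < |\psi| \ll \delta^{-O(c_1)}$ such that

$$\|\psi \circ g_k\|_{C^\infty[N_k]} \ll \delta^{-O(c_1)} \tag{3.7}$$

for $\gg \delta^{O(c_1)} K$ values of $k \in (K, 2K]$.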

Write

$$\psi \circ g(n) = \beta_d n^d + \dots + \beta_0. \tag{3.8}$$

Then

$$\psi \circ g_k(n) = \beta_d k^d n^d + \dots + \beta_0. \tag{3.9}$$

We would like to use this and (3.7) to conclude that the coefficients $k^j\beta_j$ are close
to being integer (or rational with small denominator). This will follow from a simple
lemma.

Lemma 3.2. Suppose that $p : \mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ is a polynomial of the form $p(n) = \beta_d n^d +
\dots + \beta_0$. Then there is some $q \geq 1$, $q = O(1)$, such that $\|q\beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll N^{-j}\|p\|_{C^\infty[N]}$ for
$j = 1, \dots, d$.

Proof. Consider the representation (3.6) which is used to define the $C^\infty[N]$-norm.
Observing that $\beta_j$ can be written as a linear combination of $\alpha_j, \dots, \alpha_d$ with rational
coefficients of height $O(1)$, the result follows upon clearing denominators. □
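As an illustration of the proof of Lemma 3.2, the case $d = 2$ can be written out by hand. The following is only a sketch of that special case; the value $q = 2$ and the constant $3$ below are specific to $d = 2$ and are not taken from the source.

```latex
% Worked instance of Lemma 3.2 for d = 2 (illustration only).
% Start from the binomial expansion (3.6) and collect powers of n:
\[
  p(n) = \alpha_0 + \alpha_1 \binom{n}{1} + \alpha_2 \binom{n}{2}
       = \alpha_0 + \Bigl(\alpha_1 - \tfrac{1}{2}\alpha_2\Bigr) n + \tfrac{1}{2}\alpha_2\, n^2 ,
\]
% so, for a fixed lift of \alpha_2 to \mathbb{R},
\[
  \beta_2 = \tfrac{1}{2}\alpha_2, \qquad \beta_1 = \alpha_1 - \tfrac{1}{2}\alpha_2 .
\]
% Clearing the denominator with q = 2:
\[
  \|2\beta_2\|_{\mathbb{R}/\mathbb{Z}} = \|\alpha_2\|_{\mathbb{R}/\mathbb{Z}}
     \le N^{-2}\,\|p\|_{C^\infty[N]},
  \qquad
  \|2\beta_1\|_{\mathbb{R}/\mathbb{Z}}
     \le 2\|\alpha_1\|_{\mathbb{R}/\mathbb{Z}} + \|\alpha_2\|_{\mathbb{R}/\mathbb{Z}}
     \le 3N^{-1}\,\|p\|_{C^\infty[N]} .
\]
```

The same pattern persists for general $d$: the change of basis from binomial coefficients to monomials has denominators dividing $d!$, so one may take $q = d!$.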

From (3.7), (3.9) and Lemma 3.2 we see that there is some $q \geq 1$, $q = O(1)$, such
that

$$\|q k^j \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (N/K)^{-j} \tag{3.10}$$

for $j = 1, 2, \dots, d$ and for at least $\delta^{O(c_1)} K$ values of $k \in (K, 2K]$.

Fix $j$, $1 \leq j \leq d$. To pass from the $j$th powers $k^j$ to more general integers we shall
need the following Waring-type result.

Lemma 3.3. Let $K \geq 1$ be an integer, and suppose that $S \subseteq [K]$ is a set of size $\alpha K$.
Suppose that $t \geq 2^j + 1$. Then $\gg_{j,t} \alpha^{2t} K^j$ integers in the interval $[tK^j]$ can be written
in the form $k_1^j + \dots + k_t^j$, $k_1, \dots, k_t \in S$.

Proof. It is a well-known consequence of Hardy and Littlewood's asymptotic formula
for Waring's problem (see e.g. [17]) that the number of solutions to

$$x_1^j + \dots + x_t^j = M, \qquad x_1, \dots, x_t \in [K]$$

is $\ll_{j,t} K^{t-j}$ uniformly in $M$ provided that $t \geq 2^j + 1$. (In fact, by subsequent work, such
a result is known for much smaller values of $t$ when $j$ is large.) Let $X = \{k^j : k \in S\}$
and let $r(n)$ be the number of representations of $n$ as the sum of $t$ elements of $X$. Then
by the Cauchy-Schwarz inequality and the preceding remarks we have

$$\alpha^{2t} K^{2t} = \Big(\sum_n r(n)\Big)^2 \leq |tX| \sum_n r(n)^2 \ll_{j,t} |tX| K^{2t-j},$$

which implies the result. □
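For small parameters the conclusion of Lemma 3.3 can be explored by brute force. The following is a minimal sketch, not part of the argument: the function names and the test set $S$ are hypothetical choices made only for illustration, and the enumeration is exponential in $t$, so it is usable only for tiny $K$.

```python
from itertools import product


def power_sumset(S, j, t):
    """Return the set of all sums k_1^j + ... + k_t^j with every k_i in S."""
    powers = [k ** j for k in S]
    return {sum(combo) for combo in product(powers, repeat=t)}


def representable_fraction(S, K, j, t):
    """Fraction of the integers in [1, t*K^j] that are sums of t j-th powers from S."""
    return len(power_sumset(S, j, t)) / (t * K ** j)


if __name__ == "__main__":
    K, j = 12, 2
    t = 2 ** j + 1  # the range t >= 2^j + 1 required in Lemma 3.3
    # Hypothetical test set: the elements of [K] not divisible by 3.
    S = [k for k in range(1, K + 1) if k % 3 != 0]
    print("density of S:", len(S) / K)
    print("fraction of [tK^j] represented:", representable_fraction(S, K, j, t))
```

Comparing the printed fraction with $\alpha^{2t}$ for a few sets $S$ gives a feel for how much is lost in the Cauchy-Schwarz step.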

By (3.10) and Lemma 3.3 it follows that

$$\|q l \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (K/N)^j$$

for $\gg \delta^{O(c_1)} K^j$ values of $l \in [10^j K^j]$.

The following lemma, which is [9, Lemma 3.2], may be applied to this situation.

Lemma 3.4 (Strongly recurrent linear functions are highly non-diophantine). Let $\alpha \in
\mathbb{R}$, $0 < \sigma < 1/2$, and $0 < \mu \leq \sigma/2$, and let $I \subseteq \mathbb{R}/\mathbb{Z}$ be an interval of length $\mu$
such that $\alpha n \in I$ for at least $\sigma N$ values of $n \in [N]$. Then there is some $k \in \mathbb{Z}$ with
$0 < |k| \ll \sigma^{-O(1)}$ such that $\|k\alpha\|_{\mathbb{R}/\mathbb{Z}} \ll \mu\sigma^{-O(1)}/N$. □



Let us attempt to apply this lemma with $\sigma \gg \delta^{O(c_1)}$ and $\mu \ll \delta^{-O(c_1)}(K/N)^j$. If $N$
is sufficiently large and the exponent $\sigma$ in (3.4) is sufficiently small, we see using the
bound $K/N \leq N^{-1/3}$ that the hypotheses of the lemma are satisfied and that such an
application is permissible. The conclusion is that there is some $q'$, $1 \leq q' \ll \delta^{-O(c_1)}$,
such that

$$\|qq'\beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} N^{-j}. \tag{3.11}$$

Writing $\tilde\psi := qq'\psi$, it follows from (3.8) and (3.11) that for any $n$ we have the bound

$$\|\tilde\psi \circ g(n)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} n/N.$$

If $N' := \delta^{Cc_1} N$ for some sufficiently large $C$, and if $n \in [N']$, this implies that

$$\|\tilde\psi \circ g(n)\|_{\mathbb{R}/\mathbb{Z}} \leq 1/10. \tag{3.12}$$

Now set $\tilde F : G/\Gamma \to [-1,1]$ to be the function $\tilde F := \eta \circ \tilde\psi$, where $\eta : \mathbb{R}/\mathbb{Z} \to [-1,1]$
is a function of Lipschitz norm $O(1)$ and mean zero which equals 1 on $[-1/10, 1/10]$.
Then we have $\int_{G/\Gamma} \tilde F = 0$ and $\|\tilde F\| \ll \delta^{-O(c_1)}$. From (3.12), we have

$$|\mathbb{E}_{n \in [N']} \tilde F(g(n)\Gamma)| = 1 > \delta \|\tilde F\|$$

provided that $c_1$ is chosen sufficiently small. This is contrary to the assumption that
$(g(n)\Gamma)_{n \in [N]}$ is totally $\delta$-equidistributed.

The Type II case. This is in many ways very similar to the Type I case, as the
reader will see. Recall the situation that (3.2) puts us in (with our choice of $\varepsilon$): there
are $K, W$ with $\frac{1}{2}N^{1/3} \leq K \leq 4N^{2/3}$ and $N/4 \leq KW \leq 4N$ such that

$$|\mathbb{E}_{K < k,k' \leq 2K} \mathbb{E}_{W < w,w' \leq 2W} f(kw) f(kw') f(k'w) f(k'w')| \gg \delta^{O(c_1)},$$

where $f(n) = 1_P(n) F(g(n)\Gamma)$ is real-valued. Writing the left-hand side here as

$$\mathbb{E}_{K < k,k' \leq 2K} |\mathbb{E}_{W < w \leq 2W} f(kw) f(k'w)|^2,$$

we see that there are $\gg \delta^{O(c_1)} K^2$ pairs $(k, k') \in (K, 2K]^2$ such that

$$|\mathbb{E}_{W < w \leq 2W} f(kw) f(k'w)| \gg \delta^{O(c_1)}.$$

Written out in full, for each such pair $(k, k')$ we have

$$|\mathbb{E}_{W < w \leq 2W} 1_P(kw) 1_P(k'w) F(g(kw)\Gamma) F(g(k'w)\Gamma)| \gg (\varepsilon/\log N)^{O(1)}.$$

Writing $l$ for the common difference of $P$ (thus $1 \leq l \leq Q$) we see that there is some
$b \pmod{l}$ such that for $\gg (\varepsilon/\log N)^{O(1)} K^2$ pairs $(k, k')$ we have

$$\Big|\sum_{\substack{W < w \leq 2W \\ w \equiv b \ (\mathrm{mod}\ l)}} 1_P(kw) 1_P(k'w) F(g(kw)\Gamma) F(g(k'w)\Gamma)\Big| \gg \delta^{O(c_1)} W.$$

Setting $w = lw' + b$, this may be written as

$$\Big|\sum_{w' \in I_{k,k'}} F(g(k(b + lw'))\Gamma) F(g(k'(b + lw'))\Gamma)\Big| \gg \delta^{O(c_1)} W, \tag{3.13}$$

where $I_{k,k'} \subseteq [\frac{W}{l} - 1, \frac{2W}{l}]$ is an interval. Since $1 \leq l \leq Q$, which is bounded by a small
power of $N$, and $W \gg N^{1/3}$, this is contained in $[\frac{W}{2l}, \frac{2W}{l}]$.

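For each $k, k'$ for which this holds, consider the sequence $g_{k,k'} : \mathbb{Z} \to G \times G$ defined
by $g_{k,k'}(n) = (g(kn), g(k'n))$, and also the sequence $\tilde g_{k,k'} : \mathbb{Z} \to G \times G$ defined by
$\tilde g_{k,k'}(n) = (g(k(b + ln)), g(k'(b + ln)))$. It follows from [9, Corollary 6.8] that $g_{k,k'}, \tilde g_{k,k'} \in
\mathrm{poly}(\mathbb{Z}, G_\bullet \times G_\bullet)$. Now from (3.13) we see that the sequence $(\tilde g_{k,k'}(n)(\Gamma \times \Gamma))_{n \in [N_{k,k'}]}$
fails to be $\delta^{O(c_1)}$-equidistributed in $(G/\Gamma) \times (G/\Gamma)$, for some $N_{k,k'} \in [\frac{W}{2l}, \frac{2W}{l}]$.

It follows from [9, Theorem 2.9] that there is a nontrivial horizontal character $\psi_{k,k'} :
G \times G \to \mathbb{R}/\mathbb{Z}$ with $|\psi_{k,k'}| \ll \delta^{-O(c_1)}$ such that

$$\|\psi_{k,k'} \circ \tilde g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)}.$$

By [9, Lemma 8.4] there is some $q_{k,k'} \ll \delta^{-O(c_1)}$ such that

$$\|q_{k,k'} \psi_{k,k'} \circ g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)}.$$

Pigeonholing in the possible choices of $q_{k,k'}\psi_{k,k'}$, we may find some $\psi$ with $0 < |\psi| \ll \delta^{-O(c_1)}$ such that

$$\|\psi \circ g_{k,k'}\|_{C^\infty[N_{k,k'}]} \ll \delta^{-O(c_1)} \tag{3.14}$$

for $\gg \delta^{O(c_1)} K^2$ pairs $k, k' \in (K, 2K]$.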

Write $\psi = \psi_1 \oplus \psi_2$, where $\psi_1, \psi_2 : G \to \mathbb{R}/\mathbb{Z}$ are horizontal characters, not both zero.
If

$$\psi_1 \circ g(n) = \beta_d n^d + \dots + \beta_0$$

and

$$\psi_2 \circ g(n) = \beta'_d n^d + \dots + \beta'_0$$

then

$$\psi \circ g_{k,k'}(n) = (\beta_d k^d + \beta'_d k'^d) n^d + \dots + (\beta_0 + \beta'_0).$$

By Lemma 3.2 and (3.14) there is some $1 \leq q \ll \delta^{-O(c_1)}$ such that

$$\|q(k^j\beta_j + k'^j\beta'_j)\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} N_{k,k'}^{-j} \ll \delta^{-O(c_1)} (K/N)^j$$

for $j = 1, 2, \dots, d$ and for $\gg \delta^{O(c_1)} K^2$ pairs $k, k' \in (K, 2K]$.

Suppose, without loss of generality, that $\psi_1 \neq 0$. Selecting some $k'$ that occurs in
$\gg \delta^{O(c_1)} K$ of the pairs $k, k'$ and subtracting, we see that

$$\|q k^j \beta_j\|_{\mathbb{R}/\mathbb{Z}} \ll \delta^{-O(c_1)} (K/N)^j \tag{3.15}$$

for $\gg \delta^{O(c_1)} K$ values of $k \in (-K, K)$. Using the bounds $K \gg N^{1/3}$ and (3.4) it follows
that we may ignore the contribution of $k = 0$, that is to say (3.15) holds for $\gg \delta^{O(c_1)} K$
values of $k \in [1, K]$.

Remark. Note carefully that (3.15) carries no information when $k = 0$. In our
treatment of Type I sums there was no need for a lower bound on $K$, but such an
assumption is essential if one has any desire to bound Type II sums.

The estimate (3.15) is identical to (3.10). We may now repeat the arguments used to
obtain a contradiction to (3.10) in the Type I case. The proof of Proposition 2.1 and thus
Theorem 1.1 is now complete. □

The main business of the paper is now complete. In the next section we give a brief
discussion of how our argument compares with the classical Hardy-Littlewood method.
After that we give a number of applications of Theorem 1.1.



4. Remarks on a nilpotent Hardy-Littlewood method 

It may be of interest to interpret our method in terms of the "major and minor
arcs" terminology of the Hardy-Littlewood method. Recall that to prove Davenport's
estimate

$$|\mathbb{E}_{n \in [N]} \mu(n) e(\alpha n)| \ll_A \log^{-A} N$$

one divides into two cases: the major arcs where $\alpha$ is close to a rational with small
denominator, and the minor arcs where it is not. The major arcs are handled using
L-function technology as in Appendix A and the minor arcs are handled using Type
I/II sums as in Proposition 3.1.
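The quantity in Davenport's estimate is easy to compute numerically for moderate $N$. The following is a minimal sketch for experimentation only; the function names are hypothetical, and a finite computation of this kind of course proves nothing about the asymptotic bound.

```python
import cmath
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 0 <= n <= limit.

    By convention mu[0] is set to 0.
    """
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each prime factor p flips the sign once.
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] = -mu[multiple]
            # Multiples of p^2 are not square-free.
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
    return mu


def mobius_exponential_average(N, alpha):
    """Return |E_{n in [N]} mu(n) e(alpha n)|, where e(x) = exp(2 pi i x)."""
    mu = mobius_sieve(N)
    total = sum(mu[n] * cmath.exp(2j * math.pi * alpha * n) for n in range(1, N + 1))
    return abs(total) / N


if __name__ == "__main__":
    for alpha in (0.0, 1 / 3, math.sqrt(2)):
        print(alpha, mobius_exponential_average(10 ** 5, alpha))
```

Here $\alpha = 0$ and $\alpha = 1/3$ are "major arc" frequencies and $\alpha = \sqrt{2}$ is a typical "minor arc" frequency.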

Suppose that we are considering the sum

$$\mathbb{E}_{n \in [N]} \mu(n) F(g(n)\Gamma),$$

where $\int_{G/\Gamma} F = 0$. Decompose $g$ as a product $\varepsilon g' \gamma$ where $\varepsilon$ is smooth, $\gamma$ is rational
and $g'$ is highly equidistributed on some subgroup $G'$. Then one might think of $g$ as a
"major arc" nilsequence if $G' = \{\mathrm{id}_G\}$, and as "minor arc" if $G'$ is nontrivial.

To justify this terminology, observe that one may interpret $e(\alpha n)$ as $F(g(n)\Gamma)$, where
$G/\Gamma = \mathbb{R}/\mathbb{Z}$, $g : \mathbb{Z} \to \mathbb{R}$ is the polynomial sequence $g(n) = \alpha n$ and the Lipschitz
function $F$, taking values in the unit ball of the complex plane, is simply $e(\theta)$.

If $\alpha = \frac{a}{q} + \varepsilon$, where $\varepsilon$ is small, then the decomposition $g = \varepsilon g' \gamma$ will be given by
$\varepsilon(n) = \varepsilon n$, $g'(n) = \mathrm{id}_G$ and $\gamma(n) = an/q$ and so this does indeed correspond to a "major
arc nilsequence".

If $\alpha$ is not close to a rational with small denominator then $g(n)$ will already be highly
equidistributed on $\mathbb{R}/\mathbb{Z}$, and so the decomposition $g = \varepsilon g' \gamma$ has $\varepsilon = \gamma = \mathrm{id}_G$ and $g' = g$.
Thus $G' = \mathbb{R}$ is nontrivial and this corresponds to a "minor arc nilsequence".

5. On bracket polynomials

By a bracket polynomial we mean an object formed from the scalar field $\mathbb{R}$ and the
indeterminate $n$ using finitely many instances of the standard arithmetic operations $+$,
$\times$ together with the integer part operation $\lfloor \cdot \rfloor$ and the fractional part operation $\{ \cdot \}$. The
following are all bracket polynomials: $n^2 + n\sqrt{2}$, $n\sqrt{2}\lfloor n\sqrt{3} \rfloor$ and $\{n^3\sqrt{2} + n^2\lfloor n\sqrt{5} \rfloor + \sqrt{7}\}$.
One may associate a notion of complexity to any bracket polynomial $p(n)$, this being
(for instance) the least number of operations $+$, $\times$, $\lfloor \cdot \rfloor$, $\{ \cdot \}$ required to write down $p$. In
view of the relation $\{x\} + \lfloor x \rfloor = x$, it is not strictly speaking necessary to retain both the
integer and fractional part operations, but we do so here for convenience. Dispensing
with one of them would slightly alter the definition of complexity.

The following remarkable theorem of Bergelson and Leibman [2] demonstrates a close
link between bracket polynomials and nilmanifolds. If $G/\Gamma$ is a nilmanifold with Mal'cev
basis $\mathcal{X}$ then recall from [9, Lemma A.14] that the coordinate map $\psi : G \to \mathbb{R}^m$
provides an identification between $G/\Gamma$ and $[0,1)^m$. Write $\tau_1, \dots, \tau_m$ for the individual
coordinate maps from $G/\Gamma$ to $[0,1)$, that is to say $\tau_i$ is the composition of $\psi$ with the
map $(t_1, \dots, t_m) \mapsto t_i$.

Theorem 5.1 (Bergelson-Leibman). The functions of the form $n \mapsto \{p(n)\}$, where $p$
is a bracket polynomial, coincide with the functions of the form $n \mapsto \tau_i(g(n)\Gamma)$, where
$G/\Gamma$ is a nilmanifold equipped with a Mal'cev basis $\mathcal{X}$ and $g : \mathbb{Z} \to G$ is a polynomial
map with coefficients in some filtration $G_\bullet$. The rationality of $\mathcal{X}$, the dimension of $G$,
the degree of $g$ and the rationality of $G_\bullet$ may all be bounded in terms of the complexity
of $p$, and conversely the complexity of $p$ may be bounded in terms of these quantities.

In fact, Bergelson and Leibman prove a number of rather refined variants of this
type of result, and they also give a comprehensive and edifying discussion of bracket
polynomials in general. At first glance it appears that one might immediately combine
Theorem 5.1 with Theorem 1.1 to obtain a result about the correlation of the Möbius
function with bracket polynomials. There is a serious catch, however: the coordinate
functions $\tau_i$ are not continuous on the nilmanifold $G/\Gamma$. Furthermore, as observed by
Bergelson and Leibman, there are bracket polynomials which cannot be written in the
form $F(g(n)\Gamma)$ for a continuous $F$. Indeed the results of Leibman [12] on the distribution
of $(g(n)\Gamma)_{n \in \mathbb{Z}}$ imply that the sequence $(F(g(n)\Gamma))_{n \in \mathbb{Z}}$ cannot have isolated values, yet
there are bracket polynomials which do. A simple example is $\lfloor 1 - \{n\sqrt{2}\} \rfloor$, which is
zero except when $n = 0$.
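The isolated value of the bracket polynomial $\lfloor 1 - \{n\sqrt{2}\} \rfloor$ is easy to observe numerically. The following is a small sketch in floating-point arithmetic (so it is only reliable for moderate $|n|$); the function names are hypothetical.

```python
import math


def frac(x):
    """Fractional part {x} = x - floor(x), a value in [0, 1)."""
    return x - math.floor(x)


def isolated_bracket(n):
    """Evaluate the bracket polynomial floor(1 - {n * sqrt(2)})."""
    return math.floor(1 - frac(n * math.sqrt(2)))


if __name__ == "__main__":
    # Collect the n in a symmetric range at which the value is non-zero.
    print([n for n in range(-20, 21) if isolated_bracket(n) != 0])
```

Since $\{n\sqrt{2}\} \in (0,1)$ for every non-zero integer $n$, the value $1$ is attained only at $n = 0$, which is the isolated value referred to above.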

One does nonetheless feel that the discontinuities of $\tau_i$ are "mild", as this function is
continuous on that part of $G/\Gamma$ which is identified with $(0,1)^m$. However, the sequence
$(g(n)\Gamma)_{n \in \mathbb{Z}}$ may well concentrate on a highly singular subset of $G/\Gamma$, as we discussed at
length in [9]. Thus a certain amount of further work is required to obtain the expected
result, which is the following.

Theorem 5.2 (Möbius and bracket polynomials). Suppose that $p(n)$ is a bracket polynomial
and that $\Psi : [0,1] \to [-1,1]$ is a Lipschitz function. Then we have the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{p(n)\}) \ll_{A,\Psi} \log^{-A} N,$$

where the implied constant depends only on $A$, $\Psi$ and the complexity of $p$ (but is ineffective).

We shall illustrate how this theorem may be deduced from Theorem 1.1 by discussing
two related special cases. We will then sketch the details that are required in order to
write down a complete proof. The authors plan to include a complete proof of Theorem
5.2 in a future publication.

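Both special cases will take place on the Heisenberg nilmanifold $G/\Gamma$, where

$$G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

Computations with Mal'cev bases in this setting were given in [7, Appendix B] and then
again in [9, §5], where we took

$$e_1 = \exp(X_1) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_2 = \exp(X_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad e_3 = \exp(X_3) = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$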


We briefly recall some of the computations carried out in somewhat more detail in that
paper; in any case the proofs are nothing more than computations with $3 \times 3$ matrices.
The coordinate function $\psi : G \to \mathbb{R}^3$ is then given by the formula

$$\psi\begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} = (x, y, z - xy),$$

and the element written here is equivalent, under right multiplication by an element of
$\Gamma$, to the element with coordinates

$$(\{x\}, \{y\}, \{z - xy + \lfloor x \rfloor y\}).$$

Note that this lies inside the fundamental domain $[0,1)^3$. It follows that, for any $\alpha, \beta \in
\mathbb{R}$, we have

$$\{n\beta\lfloor n\alpha \rfloor\} = \tau_3(g(n)\Gamma),$$

where $\tau_3 : G/\Gamma \to [0,1)$ is the map into the third coordinate and $g : \mathbb{Z} \to G$ is the
polynomial sequence given by

$$g(n) = \begin{pmatrix} 1 & n\alpha & n^2\alpha\beta \\ 0 & 1 & n\beta \\ 0 & 0 & 1 \end{pmatrix}.$$

This is an explicit example of the representation of a bracket polynomial, in this case
$\{n\beta\lfloor n\alpha \rfloor\}$, in the form discussed in Bergelson and Leibman's theorem.
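The identity $\{n\beta\lfloor n\alpha \rfloor\} = \tau_3(g(n)\Gamma)$ can be checked mechanically. The sketch below assumes the coordinate map $\psi(x,y,z) = (x, y, z - xy)$ and the sequence $g(n)$ with entries $(n\alpha, n\beta, n^2\alpha\beta)$; the function names are hypothetical. Since the identity is purely algebraic it holds for all real $\alpha, \beta$, so the check uses rational values with exact arithmetic to avoid rounding issues.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part {x} = x - floor(x), a value in [0, 1)."""
    return x - floor(x)


def tau3(x, y, z):
    """Third Mal'cev coordinate of the coset of [[1,x,z],[0,1,y],[0,0,1]] in [0,1)^3."""
    # Right multiplication by the integer matrix [[1,a,c],[0,1,b],[0,0,1]] sends
    # (x, y, z) to (x + a, y + b, z + x*b + c).  Choose a, b to reduce x and y.
    a, b = -floor(x), -floor(y)
    x1, y1, z1 = x + a, y + b, z + x * b
    # The coordinate map is psi = (x, y, z - x*y); the integer c reduces it mod 1.
    return frac(z1 - x1 * y1)


def g(n, alpha, beta):
    """Entries (x, y, z) of the polynomial sequence g(n) in the Heisenberg group."""
    return (n * alpha, n * beta, n * n * alpha * beta)


def bracket(n, alpha, beta):
    """The bracket polynomial {n*beta*floor(n*alpha)}."""
    return frac(n * beta * floor(n * alpha))


if __name__ == "__main__":
    alpha, beta = Fraction(7, 5), Fraction(11, 3)
    print(all(tau3(*g(n, alpha, beta)) == bracket(n, alpha, beta) for n in range(-50, 51)))
```

With floating-point inputs such as `math.sqrt(2)` the two sides agree only up to rounding, and should be compared modulo one with a tolerance.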

We discuss two different cases.

Case 1. $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$. Then the sequence $(g(n)\Gamma)_{n \in [N]}$ is totally $N^{-c}$-equidistributed
on $G/\Gamma$, which makes life rather easy. To prove the equidistribution one may use
[9, Theorem 2.9] together with the lower bound

$$\min_{\substack{|k_1|, |k_2| \leq K \\ (k_1,k_2) \neq (0,0)}} \|k_1\sqrt{2} + k_2\sqrt{3}\|_{\mathbb{R}/\mathbb{Z}} \gg K^{-O(1)},$$

which follows from the fact that, for any $k_3$ with $|k_3| \ll K$, $k_1\sqrt{2} + k_2\sqrt{3} + k_3$ satisfies a
quartic over $\mathbb{Z}$ with coefficients of size $K^{O(1)}$. Although the function $\tau_3$ is not continuous,
it is continuous outside of a subset of $G/\Gamma$ of measure zero, namely outside of $[0,1)^3 \setminus
(0,1)^3$. This means that it may be approximated by Lipschitz functions. More precisely,
for any fixed Lipschitz function $\Psi : [0,1] \to [-1,1]$ and any $\varepsilon > 0$ one may find functions
$F_1, F_2 : G/\Gamma \to \mathbb{C}$ with $\|F_1\|_\infty, \|F_2\|_\infty \leq 1$, $\|F_1\|_{\mathrm{Lip}}, \|F_2\|_{\mathrm{Lip}} \ll \varepsilon^{-O(1)}$, $|\Psi \circ \tau_3 - F_1| \leq F_2$
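pointwise and $\int_{G/\Gamma} F_2 \leq \varepsilon$. From Proposition 2.1 we have

$$|\mathbb{E}_{n \in [N]} \mu(n) F_1(g(n)\Gamma)| \ll \varepsilon^{-O(1)} N^{-c}$$

and the uniform distribution of $(g(n)\Gamma)_{n \in [N]}$ implies that

$$\mathbb{E}_{n \in [N]} F_2(g(n)\Gamma) \leq \varepsilon + O(\varepsilon^{-O(1)} N^{-c}).$$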

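Now we have the bounds

$$|\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{3}\lfloor n\sqrt{2} \rfloor\})| = |\mathbb{E}_{n \in [N]} \mu(n) \Psi \circ \tau_3(g(n)\Gamma)|$$

$$\leq |\mathbb{E}_{n \in [N]} \mu(n) F_1(g(n)\Gamma)| + \mathbb{E}_{n \in [N]} F_2(g(n)\Gamma).$$

Letting $\varepsilon = N^{-c'}$ for some sufficiently small $c' > 0$, we obtain an effective and much
stronger version of Theorem 5.2 in this case, namely the bound

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{3}\lfloor n\sqrt{2} \rfloor\}) \ll N^{-c'}.$$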

Case 2. $\alpha = \beta = \sqrt{2}$. Now the sequence $(g(n)\Gamma)_{n \in [N]}$ is manifestly not uniformly
distributed on $G/\Gamma$. In fact $g$ takes values, modulo $\Gamma$, in the one-dimensional subgroup $G' \subsetneq G$
defined by

$$G' = \Big\{ \begin{pmatrix} 1 & x & x^2/2 \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} : x \in \mathbb{R} \Big\}.$$

The preceding argument breaks down. One could appeal to Theorem 1.1 instead of
Proposition 2.1, but the problem comes when one tries to control the term

$$\mathbb{E}_{n \in [N]} F_2(g(n)\Gamma).$$

Without knowing something more about the relation between the support properties of
$F_2$ and the orbit $(g(n)\Gamma)_{n \in [N]}$, it is not possible to control this term.
In the case at hand $(g(n)\Gamma)_{n \in [N]}$ is $N^{-c}$-equidistributed in the nilmanifold $G'/\Gamma'$ where
$\Gamma' := \Gamma \cap G'$. Topologically and algebraically this nilmanifold is nothing more than $\mathbb{R}/\mathbb{Z}$,
but one should note carefully that the Haar measure on this nilmanifold is not the same
as the measure induced from the Haar measure on $G$. This may be used to "explain"
the observation that $n\sqrt{2}\lfloor n\sqrt{2} \rfloor$ is not uniformly distributed modulo one; see [2] for
further details.

Inside $G/\Gamma$, $G'/\Gamma'$ may be identified with the union of two segments

$$\Big\{ \begin{pmatrix} 1 & x & x^2/2 \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} : 0 \leq x < 1 \Big\} \cup \Big\{ \begin{pmatrix} 1 & x & x^2/2 + 1/2 \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} : 0 \leq x < 1 \Big\},$$

and this makes it clear that the induced map $\tau_3 : G'/\Gamma' \to [0,1)$ is continuous away
from a single point. By an analysis very similar to the preceding one it may once again
be shown that

$$\mathbb{E}_{n \in [N]} \mu(n) \Psi(\{n\sqrt{2}\lfloor n\sqrt{2} \rfloor\}) \ll N^{-c}$$

for any fixed Lipschitz function $\Psi : [0,1] \to [-1,1]$.

Amongst examples of the form $n\beta\lfloor n\alpha \rfloor$ there is a third distinct case, typified by
$\alpha = \beta = 2^{1/3}$. We leave the analysis of this to the reader.

Sketch proof of the general case of Theorem 5.2. By Theorem 5.1, the result of
Bergelson and Leibman, it suffices to show, for any fixed Lipschitz function $\Psi : [0,1] \to
[-1,1]$, that

$$\mathbb{E}_{n \in [N]} \mu(n) (\Psi \circ \tau_i)(g(n)\Gamma) \ll_A \log^{-A} N.$$

Here, the notation and parameters are as described in Theorem 5.1. Now $\tau_i$ is continuous
outside the set $[0,1)^m \setminus (0,1)^m$, which has zero measure in $G/\Gamma$. The issue lies in
understanding how the orbit $(g(n)\Gamma)_{n \in [N]}$ interacts with this.

Now the main results of [9] allow us to get a handle on this situation. Consider in
particular the decomposition of $g$ as $\varepsilon g' \gamma$ which was obtained in [9, Theorem 1.19].
Recall that $\varepsilon : \mathbb{Z} \to G$ is slowly varying, $\gamma : \mathbb{Z} \to G$ is rational and $g' : \mathbb{Z} \to G'$ is
such that $(g'(n)\Gamma')_{n \in [N]}$ is totally equidistributed. For a full proof of Theorem 5.2 one
would naturally need to specify appropriate quantitative parameters here. Suppose for
simplicity that $\varepsilon = \gamma = \mathrm{id}_G$ (this was, in fact, the case in the two examples above).

Choose a Mal'cev basis for $G'/\Gamma'$ with coordinate map $\psi' : G' \to \mathbb{R}^{m'}$. Then $G'/\Gamma'$
may be identified with the region $\psi'^{-1}([0,1)^{m'}) \subseteq G$, and in this way we think of the
coordinate function $\tau_i$ as a function on $G'/\Gamma'$. Write $\tilde\tau_i$ for the corresponding function
on $[0,1)^{m'}$. It can be shown, making extensive use of the results of [9, Appendix A],
that $\tilde\tau_i$ is continuous outside of a piecewise polynomial set of positive codimension, that
is to say outside of a finite union of sets each of which is defined by some polynomial
inequalities $a \leq P(t_1, \dots, t_{m'}) < b$ and at least one nontrivial polynomial equation
$Q(t_1, \dots, t_{m'}) = c$. Related matters are discussed at greater length in [2]; in the two
examples we discussed, these piecewise polynomial sets were rather simple. These sets
are certainly well-behaved enough that $\tau_i$ may be approximated using Lipschitz functions
$F_1$ and $F_2$ as in our treatment of the bracket polynomial $n\sqrt{3}\lfloor n\sqrt{2} \rfloor$, and in this way
one may use Theorem 1.1 to obtain the desired bound

$$\mathbb{E}_{n \in [N]} \mu(n) (\Psi \circ \tau_i)(g'(n)\Gamma) \ll_A \log^{-A} N.$$

If $G' \neq \{\mathrm{id}\}$ then one may in fact use Proposition 2.1 to obtain the stronger bound of
$N^{-c}$, as in the examples.

If $\varepsilon$ and $\gamma$ are not trivial it is even more complicated to write down a fully rigorous
argument, but conceptually things are not much harder at all. The introduction of the
smooth function $\varepsilon(n)$ has a rather benign effect; if $n$ ranges over an interval of length
$\delta' N$, for suitably small $\delta' = \delta'(\delta)$, the discontinuities of the functions $x \mapsto \tau_i(\varepsilon(n) x \Gamma)$
are all contained inside a "nice" set of measure at most $\delta$, and one may proceed much
as before. All one need do, then, is split the range $[N]$ into suitably short intervals of
this type.

The introduction of $\gamma$ may be handled much as it was in the proof of Theorem 1.1.
One splits each of the intervals from the previous paragraph into progressions $P_j$ with
the same (small) common difference $q$ such that $\gamma(n)\Gamma$ is constant and equal to $\gamma_j\Gamma$ on
$P_j$. One then works with the conjugated sequences $\gamma_j^{-1} g'(n) \gamma_j$ as we did at the end of
§2. □



We conclude by remarking on some variants and generalizations of Theorem 5.2. If
$p_1, \dots, p_M$ are bracket polynomials and $F : (\mathbb{R}/\mathbb{Z})^M \to \mathbb{C}$ is a smooth function then one
could establish the estimate

$$\mathbb{E}_{n \in [N]} \mu(n) F(\{p_1(n)\}, \dots, \{p_M(n)\}) \ll_A \log^{-A} N$$

by Fourier decomposition of $F$ and Theorem 5.2. One could, if desired, restrict the
range of the average to some fixed subprogression $P \subseteq [N]$ by the standard technique
of approximating the cutoff $1_P(n)$ by a smoother function $\tilde 1_P(n)$ and then developing
this as a Fourier expansion.



6. The Liouville function

Everything we have proved for the Möbius function also holds for the Liouville function
$\lambda : \mathbb{N} \to \{-1, 1\}$, defined to be the unique completely multiplicative function such
that $\lambda(p) = -1$ for all primes $p$. This function is related to the Möbius function via the
identity

$$\lambda(n) = \sum_{r : r^2 | n} \mu(n/r^2).$$

Thus, with the notation and assumptions of Theorem 1.1, we have

$$|\mathbb{E}_{n \in [N]} \lambda(n) F(g(n)\Gamma)| \leq \sum_{r \leq \sqrt{N}} \frac{1}{r^2} |\mathbb{E}_{m \in [N/r^2]} \mu(m) F(g(r^2 m)\Gamma)|.$$

Now by [9, Corollary 6.8] $m \mapsto g(r^2 m)$ is a polynomial sequence with coefficients in the
same filtration $G_\bullet$ as $g$, and so we have the bound

$$|\mathbb{E}_{m \in [N/r^2]} \mu(m) F(g(r^2 m)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \log^{-A}(N/r^2)$$

uniformly in $r$, so long as $N/r^2 \geq 2$. Summing over $r$ we obtain

$$|\mathbb{E}_{n \in [N]} \lambda(n) F(g(n)\Gamma)| \ll_{m,d,A} Q^{O_{m,d,A}(1)} (1 + \|F\|_{\mathrm{Lip}}) \Big( \sum_{r \leq \sqrt{N/2}} \frac{1}{r^2} \log^{-A}(N/r^2) + \sum_{\sqrt{N/2} < r \leq \sqrt{N}} \frac{1}{r^2} \Big).$$

This is precisely Theorem 1.1 but with $\lambda$ taking the place of $\mu$. In a similar fashion,
all of the results of the preceding section concerning bracket polynomials may now also
be deduced with $\lambda$ in place of $\mu$.
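The identity $\lambda(n) = \sum_{r : r^2 | n} \mu(n/r^2)$ relating the Liouville and Möbius functions is easy to confirm by direct computation. The following is a minimal sketch using trial division; the function names are hypothetical and the routines are intended only for small $n$.

```python
def mobius(n):
    """Möbius function of n >= 1, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def liouville(n):
    """Liouville function: (-1)^(number of prime factors of n counted with multiplicity)."""
    sign = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    if n > 1:
        sign = -sign
    return sign


def liouville_from_mobius(n):
    """Right-hand side of the identity: sum of mu(n / r^2) over all r with r^2 dividing n."""
    total = 0
    r = 1
    while r * r <= n:
        if n % (r * r) == 0:
            total += mobius(n // (r * r))
        r += 1
    return total


if __name__ == "__main__":
    print(all(liouville(n) == liouville_from_mobius(n) for n in range(1, 2001)))
```

Only the term with $r^2$ equal to the largest square divisor of $n$ contributes, since $\mu$ vanishes on every other quotient $n/r^2$.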

7. A recurrence result along the primes

In this section we derive the following result. Here $p_1, p_2, p_3, \dots$ is the sequence of
primes.

Theorem 7.1 (Prime return times on a nilmanifold). Suppose that $G/\Gamma$ is a nilmanifold
and that $g \in G$ is such that left-multiplication by $g$ is ergodic. Then for every $x \in G/\Gamma$
the sequence $(g^{p_n} x \Gamma)_{n = 1, 2, \dots}$ is equidistributed in $G/\Gamma$ in the sense that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g^{p_n} x \Gamma) = \int_{G/\Gamma} F$$

for all continuous functions $F : G/\Gamma \to [-1, 1]$.

Remarks. We recall (from discussions in the companion paper [9]) Leon Green's
criterion for ergodicity of left-multiplication by $g$: this map is ergodic if and only if
rotation by $\pi(g)$ is ergodic on the horizontal torus $(G/\Gamma)_{\mathrm{ab}}$, that is to say if and only if
the entries of $\pi(g)$ together with 1 are linearly independent over $\mathbb{Q}$. If this is the case
then left-multiplication by any power of $g$ is uniquely ergodic, that is to say

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} F(g^{tn} x \Gamma) = \int_{G/\Gamma} F \tag{7.1}$$

for all $x \in G/\Gamma$ and for $t = 1, 2, 3, \dots$.

Proof of Theorem 7.1. Let $w$ be a large number and set $W := \prod_{p \leq w} p$. Fix a
nilmanifold $G/\Gamma$ and a Lipschitz function $F : G/\Gamma \to [-1,1]$ (the case of a general continuous $F$ follows by a standard approximation argument).
Then uniformly in the residues $b$ coprime to $W$ we have

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \Big( \frac{\phi(W)}{W} \Lambda'(Wn + b) - 1 \Big) F(g^n x \Gamma) = o_{w \to \infty}(1), \tag{7.2}$$

where the convergence is uniform in $x \in G/\Gamma$ and $g \in G$. This follows very quickly
from [8, Proposition 10.2], which was proved under the assumption of the Möbius and
Nilsequences conjectures MN(s) which we have established in this paper. Recall that
$\Lambda'(p) = \log p$ and that $\Lambda'(n) = 0$ if $n$ is not a prime, that is to say $\Lambda'$ is a modified version
of the von Mangoldt function with no support on the prime powers $p^2, p^3, \dots$. We recall
that the proof of (7.2) is quite substantial. One splits the von Mangoldt function $\Lambda$
in a certain way as the sum of two pieces $\Lambda^\sharp + \Lambda^\flat$. The contribution from the second
piece is bounded using the MN(s) conjecture, and this is not particularly difficult. The
contribution from the first piece is bounded using the machinery of Gowers norms, and
here one must estimate the dual Gowers norm of the nilsequence $F(g^n x \Gamma)$ as well as the
Gowers norm of objects related to $\Lambda^\sharp$. This is a substantial amount of work.

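Let us return to the proof at hand. Since (7.2) is uniform in $g$ and $x$, we may replace
$g$ by $g^W$ and $x$ by $g^b x$ to get

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} \Big( \frac{\phi(W)}{W} \Lambda'(n) - 1 \Big) F(g^n x \Gamma) = o_{w \to \infty}(1)$$

uniformly for all progressions $P_{b,W} = \{Wn + b : n \in [N]\}$, $b = 0, 1, \dots, W - 1$. However
it follows from (7.1) that, for fixed $b$ and $W$,

$$\lim_{N \to \infty} \mathbb{E}_{n \in P_{b,W}} F(g^n x \Gamma) = \int_{G/\Gamma} F.$$

Comparing these last two expressions we obtain

$$\lim_{N \to \infty} \frac{\phi(W)}{W} \mathbb{E}_{n \in P_{b,W}} \Lambda'(n) F(g^n x \Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1),$$

uniformly for $b$ coprime to $W$. Now if $b$ is not coprime to $W$ we obviously have

$$\lim_{N \to \infty} \frac{\phi(W)}{W} \mathbb{E}_{n \in P_{b,W}} \Lambda'(n) F(g^n x \Gamma) = o_{w \to \infty}(1)$$

since $\Lambda'$ is supported on the primes and $F$ is bounded by 1.
Summing over $b$, one may conclude that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [WN]} \Lambda'(n) F(g^n x \Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1).$$

This is easily seen to imply that

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \Lambda'(n) F(g^n x \Gamma) = \int_{G/\Gamma} F + o_{w \to \infty}(1).$$

The left-hand side no longer depends on $w$, so we may let $w \to \infty$. Doing so, we obtain

$$\lim_{N \to \infty} \mathbb{E}_{n \in [N]} \Lambda'(n) F(g^n x \Gamma) = \int_{G/\Gamma} F.$$

An easy argument using the prime number theorem, noting that $\Lambda'(p_n)$ is essentially
$\log N$ for almost all primes $p_n$, $n \leq N$, concludes the proof. □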

Very straightforward approximation arguments allow one to replace the continuous
function $F$ by a function with mild discontinuities. In this way one could prove, for
example, that the sequence $p_n\sqrt{3}\lfloor p_n\sqrt{2} \rfloor$ is uniformly distributed modulo one. We leave
the details, which are essentially all present in the earlier discussion of $n\sqrt{3}\lfloor n\sqrt{2} \rfloor$, to
the reader.

Appendix A. Mobius and periodic functions 

In this appendix we give the proof of Proposition A.2. The argument is, quite apart
from being completely standard, already contained in ^ Chapter 3]. We nonetheless 
take the opportunity to recall it here, as we wish to emphasise the fact that the main 
input to this part of the argument is information on the zeros of L-functions. Our
starting point is the following proposition.

Proposition A.1. For any $A > 0$ we have

$$\mathbb{E}_{n \in [N]} \mu(n) \chi(n) \ll_A q^{1/2} \log^{-A} N \tag{A.1}$$

for all Dirichlet characters $\chi$ to modulus $q$.

Remark. This follows from the nonexistence of zeros of $L(s, \chi)$ close to the line $\Re s = 1$.
For the details, see [10, Prop. 5.29]. As noted in [10, p. 124] there are difficulties involved
in applying the standard Perron's formula approach to $\mathbb{E}_{n \in [N]} \mu(n) \chi(n)$ directly, and it
is rather easier to first obtain bounds on $\mathbb{E}_{n \in [N]} \Lambda(n) \chi(n)$.

Using standard techniques of harmonic analysis we may obtain the following consequence
of Proposition A.1.

Proposition A.2 (Möbius is orthogonal to periodic sequences). Let $f : \mathbb{N} \to \mathbb{C}$ be a
sequence bounded in magnitude by 1 which is periodic of some period $q \geq 1$. Then we
have

$$\mathbb{E}_{n \in [N]} \mu(n) \overline{f(n)} \ll_A q \log^{-A} N$$

for all $A > 0$, where the implied constant is ineffective.

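Proof. We first establish the estimate under the additional assumption that $f(n)$
vanishes whenever $(n,q) \neq 1$. Then $f$ can be viewed as a function on the multiplicative
group $(\mathbb{Z}/q\mathbb{Z})^\times$, and thus has a Fourier expansion

$$f(n) = \sum_\chi \hat f(\chi)\chi(n), \quad \text{where } \hat f(\chi) := \mathbb{E}_{n \in (\mathbb{Z}/q\mathbb{Z})^\times} f(n)\overline{\chi(n)},$$

with $\chi$ ranging over all the characters on $(\mathbb{Z}/q\mathbb{Z})^\times$. Applying Proposition A.1 and the
triangle inequality, we conclude

$$\mathbb{E}_{n \in [N]}\mu(n)\overline{f(n)} \ll_A q^{1/2}\log^{-A} N \Big(\sum_\chi |\hat f(\chi)|\Big).$$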

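But from Cauchy-Schwarz and Plancherel we have

$$\sum_\chi |\hat f(\chi)| \leq \phi(q)^{1/2}\Big(\sum_\chi |\hat f(\chi)|^2\Big)^{1/2} \leq \phi(q)^{1/2},$$

where $\phi(q) := |(\mathbb{Z}/q\mathbb{Z})^\times|$ is the Euler totient function. Since $\phi(q) \leq q$, the claim follows.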
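The Fourier-analytic step can be checked by hand for a small modulus. The sketch below is an illustration only: it takes the hypothetical prime modulus 7 with primitive root 3, builds the six Dirichlet characters from discrete logarithms, and verifies Fourier inversion, Plancherel and the resulting $\ell^1$ bound for one arbitrarily chosen $f$ of magnitude at most 1. The function names and the sample values of `f` are assumptions of the example.

```python
import cmath
import math

def discrete_logs(q, g):
    """Map each unit n mod q to the exponent a with g**a = n (mod q); g must be a primitive root."""
    logs = {}
    x = 1
    for a in range(q - 1):
        logs[x] = a
        x = x * g % q
    return logs

def characters(q, g):
    """All characters of (Z/qZ)^x for prime q, each stored as a dict unit -> value."""
    logs = discrete_logs(q, g)
    order = q - 1
    return [
        {n: cmath.exp(2j * cmath.pi * j * a / order) for n, a in logs.items()}
        for j in range(order)
    ]

def fourier_coefficients(f, chars):
    """hat f(chi) = average over units n of f(n) * conj(chi(n))."""
    units = sorted(f)
    return [sum(f[n] * chi[n].conjugate() for n in units) / len(units) for chi in chars]

# Hypothetical data: modulus 7, primitive root 3, and some f with |f| <= 1 on the units.
chars = characters(7, 3)
f = {1: 1, 2: -1, 3: 0.5j, 4: 0, 5: -0.5, 6: 1j}
f_hat = fourier_coefficients(f, chars)

# Fourier inversion: f(n) = sum over chi of hat f(chi) * chi(n).
inversion_error = max(
    abs(f[n] - sum(c * chi[n] for c, chi in zip(f_hat, chars))) for n in f
)
# Plancherel: sum over chi of |hat f(chi)|^2 equals the average of |f(n)|^2.
plancherel_gap = abs(
    sum(abs(c) ** 2 for c in f_hat) - sum(abs(v) ** 2 for v in f.values()) / len(f)
)
l1_norm = sum(abs(c) for c in f_hat)
print(inversion_error, plancherel_gap, l1_norm, math.sqrt(len(chars)))
```

Here `math.sqrt(len(chars))` plays the role of $\phi(q)^{1/2}$, and the printed `l1_norm` does not exceed it.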

Now we consider the general case, in which $(n,q)$ is not necessarily equal to 1 on
the support of $f$. Observe that if $\mu(n)$ is non-zero, then $n$ is square-free, and we can
split $n = dm$, where $d = (n,q)$ is square-free (so $\mu^2(d) = 1$) and $m$ is coprime to $q$.
Furthermore we have $\mu(n) = \mu(d)\mu(m)$. We thus obtain the decomposition

$$\mathbb{E}_{n \in [N]}\mu(n)\overline{f(n)} = \frac{1}{N}\sum_{d \mid q}\mu(d)\sum_{m \leq N/d}\mu(m)\overline{f(dm)}1_{(m,q)=1}. \qquad \text{(A.2)}$$

The sequence $m \mapsto \overline{f(dm)}1_{(m,q)=1}$ is periodic of period $q/d$ and vanishes whenever
$(m, q/d) \neq 1$, hence by the preceding arguments

$$\mathbb{E}_{m \in [N/d]}\mu(m)\overline{f(dm)}1_{(m,q)=1} \ll_A \frac{q}{d}\log^{-A} N.$$
Thus from (A.2) we have

$$\mathbb{E}_{n \in [N]}\mu(n)\overline{f(n)} \ll_A \sum_{d \mid q}\frac{1}{d}\cdot\frac{q}{d}\log^{-A} N \ll q\log^{-A} N,$$

concluding the proof of Proposition A.2. □
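The splitting $n = dm$ with $d = (n,q)$ used in the general case is an exact identity between finite sums, so it can be tested directly. The following sketch is an illustrative check under stated assumptions: the integer-valued periodic test function returned by `make_periodic`, the moduli 12 and 30, and the cutoff 500 are all hypothetical choices, and the sums are left unnormalised (no factor $1/N$) so that the comparison is exact integer arithmetic.

```python
from math import gcd

def mobius(n):
    """Mobius function of a positive integer n, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

def make_periodic(q):
    """A hypothetical integer-valued sequence of period q, used only as test data."""
    return lambda n: (n % q) ** 2 - 3

def twisted_sum_direct(f, N):
    """Sum of mu(n) f(n) over 1 <= n <= N."""
    return sum(mobius(n) * f(n) for n in range(1, N + 1))

def twisted_sum_decomposed(f, q, N):
    """Same sum, organised as a sum over d | q and over m <= N/d coprime to q."""
    total = 0
    for d in range(1, q + 1):
        if q % d != 0:
            continue
        inner = sum(
            mobius(m) * f(d * m)
            for m in range(1, N // d + 1)
            if gcd(m, q) == 1
        )
        total += mobius(d) * inner
    return total

for q in (12, 30):
    f = make_periodic(q)
    print(q, twisted_sum_direct(f, 500), twisted_sum_decomposed(f, q, 500))
```

The two columns printed for each modulus agree, reflecting that every square-free $n$ factors uniquely as $dm$ with $d \mid q$ and $m$ coprime to $q$.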

Appendix B. On total equidistribution of nilsequences

During the proof that Proposition 2.1 implies Theorem 1.1 (specifically, in the claim
towards the end of that proof), we implicitly used the following technical fact.¹

Lemma B.1. Suppose that $G/\Gamma$ is an $m$-dimensional nilmanifold, and suppose that $G_\bullet$
is a filtration of degree $d$. Suppose that $G/\Gamma$ has a $Q$-rational Mal'cev basis $\mathcal{X}$ adapted
to the filtration $G_\bullet$. Let $G'$ be a rational subgroup of $G$ which comes together with a
Mal'cev basis $\mathcal{X}'$ for $G'/\Gamma'$ (where $\Gamma' = G' \cap \Gamma$), in which each element is a $Q$-rational
combination of the elements of $\mathcal{X}$.

Let $g' \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be a polynomial sequence taking values in $G'$ and suppose that
$(g'(n)\Gamma')_{n \in [N]}$ is totally $\delta$-equidistributed in $G'/\Gamma'$.

Then for every $Q$-rational $\gamma \in G$ with coordinates bounded by $Q$ the following is true.
The conjugate $G'_\gamma := \gamma^{-1}G'\gamma$ is a rational subgroup of $G$ which comes together with a
Mal'cev basis $\mathcal{X}'_\gamma$ for $G'_\gamma/\Gamma'_\gamma$ (where $\Gamma'_\gamma := G'_\gamma \cap \Gamma$), in which each element is a $Q^{O_{m,d}(1)}$-rational
combination of the elements of $\mathcal{X}$. Furthermore, the conjugate sequence $g'_\gamma := \gamma^{-1}g'\gamma$,
which manifestly takes values in $G'_\gamma$, is $Q^{O_{m,d}(1)}\delta^{c_{m,d}}$-totally equidistributed in $G'_\gamma/\Gamma'_\gamma$.

Proof. This would follow quite straightforwardly from the results in [9, Appendix A] if,
instead of $\Gamma'_\gamma$, we worked with $\gamma^{-1}\Gamma'\gamma$. The result as stated follows from this, the lemma
below and the fact that $\gamma^{-1}\Gamma'\gamma$ and $\Gamma'_\gamma$ are $Q^{O_{m,d}(1)}$-commensurable. This follows from
the fact that $[\Gamma : \Gamma \cap \gamma^{-1}\Gamma\gamma] \leq Q^{O_{m,d}(1)}$, which may in turn be proven by noting that
the $\mathcal{X}$-coordinates of every element in $\gamma^{-1}\Gamma\gamma$ are rationals over some fixed denominator
$Q^{O_{m,d}(1)}$, as follows from [9, Appendix A]. □

Lemma B.2. Suppose that $G$ is an $m$-dimensional simply-connected nilpotent Lie
group. Let $\Gamma_1, \Gamma_2$ be two uniform subgroups which are $\frac{1}{\delta}$-commensurable in the sense
that $\Gamma_1 \cap \Gamma_2$ has index at most $\frac{1}{\delta}$ in both $\Gamma_1$ and $\Gamma_2$. Suppose that there is a $\frac{1}{\delta}$-rational
Mal'cev basis $\mathcal{X}_1$ for $G/\Gamma_1$, and that $g : \mathbb{Z} \to G$ is a polynomial sequence of degree
$d$ with the property that $(g(n)\Gamma_1)_{n \in [N]}$ is totally $\delta$-equidistributed with respect to the
metric $d_{\mathcal{X}_1}$. Then there is a Mal'cev basis $\mathcal{X}_2$ for $G/\Gamma_2$, each element of which is a
$\delta^{-O_{m,d}(1)}$-rational combination of the elements of $\mathcal{X}_1$, and such that $(g(n)\Gamma_2)_{n \in [N]}$ is
totally $\delta^{c_{m,d}}$-equidistributed with respect to the metric $d_{\mathcal{X}_2}$.

Remark. This lemma would be false without the assumption of total equidistribution.
For example, the sequence $(n/N)_{n \in [N]}$ is highly equidistributed on $\mathbb{R}/\mathbb{Z}$, but not at all
close to equidistributed on $\mathbb{R}/2\mathbb{Z}$, even though the lattices $\mathbb{Z}$, $2\mathbb{Z}$ are highly
commensurable.
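The example in the remark is easy to see numerically. In the sketch below (an illustration only, with the hypothetical choice $N = 1000$ and with $e(x) := e^{2\pi i x}$), the average of the character $e(x)$ of $\mathbb{R}/\mathbb{Z}$ over the points $n/N$ vanishes up to rounding, the average of the character $e(x/2)$ of $\mathbb{R}/2\mathbb{Z}$ has magnitude about $2/\pi$, and the same large value already appears on $\mathbb{R}/\mathbb{Z}$ when $n$ is restricted to the subprogression $n \leq N/2$, which is why the sequence is not totally equidistributed there.

```python
import cmath

def e(x):
    """The standard character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)

def average(values):
    """Arithmetic mean of a finite iterable of complex numbers."""
    values = list(values)
    return sum(values) / len(values)

N = 1000  # hypothetical length, chosen only for the illustration
points = [n / N for n in range(1, N + 1)]

# Character e(x) of R/Z over the full range: the N-th roots of unity sum to zero.
on_circle = average(e(x) for x in points)

# Character e(x/2) of R/2Z: the points only fill half of R/2Z, so no cancellation.
on_double_circle = average(e(x / 2) for x in points)

# Character e(x) of R/Z over the subprogression n <= N/2: again no cancellation.
on_half_progression = average(e(x) for x in points[: N // 2])

print(abs(on_circle), abs(on_double_circle), abs(on_half_progression))
```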



¹ We note that this fact was absent in our original posting of this paper on the arXiv in 2008; we
apologize for this oversight.
Proof. By applying the lemma twice (first with $\Gamma_1$ and $\Gamma_1 \cap \Gamma_2$ and then with $\Gamma_1 \cap \Gamma_2$
and $\Gamma_2$), we may reduce to the case where one of $\Gamma_1, \Gamma_2$ is contained in the other.
The existence of $\mathcal{X}_2$ follows, once again, from the material in [9, Appendix A]. The
commensurability of $\Gamma_1$ and $\Gamma_2$ easily implies that the coordinates of $\Gamma_2$ with respect to
$\mathcal{X}_1$ are all $\delta^{-O_{m,d}(1)}$-rational.

It remains to establish the statement about total equidistribution. If $\Gamma_1 \subseteq \Gamma_2$ then
the fundamental domain for $G/\Gamma_1$ is just a union of at most $\frac{1}{\delta}$ copies of that for $G/\Gamma_2$, and so
the result is clear. We may assume, then, that $\Gamma_2 \subseteq \Gamma_1$. Applying the quantitative
Leibman dichotomy, namely [9, Theorem 1.16], this may be further reduced to the
corresponding abelian statement. Indeed that result immediately implies that $(g(n)\Gamma)_{n \in [N]}$
is totally $\delta$-equidistributed if and only if $(\pi(g(n)\Gamma))_{n \in [N]}$ is totally $\delta'$-equidistributed in
the horizontal torus $(G/\Gamma)_{ab}$, where the dependence between $\delta$ and $\delta'$ in both directions
of this implication is polynomial of degree $O_{m,d}(1)$.

In the abelian case we have an isomorphism between $G/[G,G]$ and $\mathbb{R}^{m_{ab}}$, and we may
change coordinates so that the image of $\Gamma_1$ in $G/[G,G]$ is $\mathbb{Z}^{m_{ab}}$. In this setting the image of $\Gamma_2$ is then the
image of $\mathbb{Z}^{m_{ab}}$ under some invertible linear transformation $A \in \mathrm{GL}_{m_{ab}}(\mathbb{Q})$ with matrix
coefficients of size $\delta^{-O_{m,d}(1)}$. We have reduced matters, then, to the following assertion.

Lemma B.3. Let $m \geq 1$ be a positive integer and suppose that $A \in \mathrm{GL}_m(\mathbb{Z})$ is invertible
and has entries bounded by $\frac{1}{\delta}$. Let $g : \mathbb{Z} \to \mathbb{R}^m$ be a polynomial of degree $d$, and write
$g' := A \circ g$. Suppose that the sequence $(g'(n)\mathbb{Z}^m)_{n \in [N]}$ is totally $\delta$-equidistributed in
$\mathbb{R}^m/\mathbb{Z}^m$. Then the sequence $(g(n)\mathbb{Z}^m)_{n \in [N]}$ is totally $\delta'$-equidistributed in $\mathbb{R}^m/\mathbb{Z}^m$, for
some $\delta' = \delta^{c_{m,d}}$.

Proof. We use the quantitative Leibman dichotomy, [9, Theorem 1.16], but now of
course in the abelian setting. If $(g(n)\mathbb{Z}^m)_{n \in [N]}$ fails to be totally $\delta'$-equidistributed
then there is some $k \in \mathbb{Z}^m$, $|k| \leq \delta'^{-O_{m,d}(1)}$, such that $\|k \cdot g\|_{C^\infty[N]} \leq \delta'^{-O_{m,d}(1)}$. This
implies that $\|k' \cdot g'\|_{C^\infty[N]} \leq \delta'^{-O_{m,d}(1)}$, where $k' := (\det A)kA^{-1}$; note that $k' \in \mathbb{Z}^m$
and that $|k'| \leq \delta'^{-O_{m,d}(1)}$. Recalling the definition of $\|\cdot\|_{C^\infty[N]}$ and passing to an
appropriate subprogression $P \subseteq [N]$, one sees that $(g(n))_{n \in P}$ concentrates near a set of the form
$\{x \in \mathbb{R}^m/\mathbb{Z}^m : \|k \cdot x\|_{\mathbb{R}/\mathbb{Z}} < c\}$ with $c$ a small constant. This contradicts the assumed total $\delta$-equidistribution of
$(g'(n)\mathbb{Z}^m)_{n \in [N]}$, provided that $\delta'$ is chosen to be a sufficiently small power of $\delta$. □
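The passage from $k$ to $k' := (\det A)kA^{-1}$ rests on two elementary facts: $k'$ is again an integer vector when $A$ has integer entries, and $k' \cdot (Ax) = (\det A)\, k \cdot x$. The sketch below checks both in exact arithmetic for a hypothetical $2 \times 2$ example; the matrix `A`, the vector `k` and the sample point `x` are assumptions of the illustration, and $A$ is taken to be an integer matrix with non-zero determinant so that the factor $\det A$ is visible.

```python
from fractions import Fraction

def det2(A):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = A
    return a * d - b * c

def adjugate2(A):
    """Adjugate of a 2x2 matrix, which equals det(A) * A^{-1} when det(A) != 0."""
    (a, b), (c, d) = A
    return [[d, -b], [-c, a]]

def row_times_matrix(k, M):
    """Row vector k times the 2x2 matrix M."""
    return [sum(k[i] * M[i][j] for i in range(2)) for j in range(2)]

def matrix_times_column(M, x):
    """The 2x2 matrix M times the column vector x."""
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def dot(u, v):
    """Euclidean dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical data for the illustration.
A = [[2, 1], [1, 3]]                         # integer matrix with det(A) = 5
k = [1, -2]                                  # integer frequency vector
k_prime = row_times_matrix(k, adjugate2(A))  # k' = det(A) * k * A^{-1}, an integer vector
x = [Fraction(1, 3), Fraction(-5, 7)]        # a sample value of g(n)

lhs = dot(k_prime, matrix_times_column(A, x))  # k' . g'(n) with g' = A g
rhs = det2(A) * dot(k, x)                      # det(A) * (k . g(n))
print(k_prime, lhs, rhs)
```

In particular $k' \cdot g'$ is an integer multiple of $k \cdot g$, which is how smoothness of $k \cdot g$ transfers to $k' \cdot g'$ at the cost of a factor $|\det A|$.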